Superficial Looks - A Case for Data (and Course Reflection)
People often repeat the expression, “It’s what’s on the inside that counts.” Applied to data, this is certainly true: the value of the data itself and the practices used to collect it matter most. But data behaves much like people do in this respect. How it is presented, dressed up, and expressed largely controls how well an audience receives it and cares about it, regardless of its relevance or importance. Data is, after all, seen and used by humans, and humans weigh visual impressions heavily, whether the choice is conscious or not.
We’ve discussed in class some basic principles of representing data well: consistent colors, less clutter, chart types suited to the message, and careful choices of fonts, sizes, and text. Such things seemed trivial to me, the kind of care anyone who values their work would take. Going through the lectures and slides, these concepts felt basic, and it was hard to see why people would skip them; they should not need to be enforced standards so much as habits everyone adopts. But presumptions like that rarely hold across a whole population, since many people never have to present or interpret data at all. As a result, a large market for data visualization has emerged, as seen in SAS, whose site markets visual analytics, and similar services built around visualization and ease of access have appeared everywhere, such as Squarespace. As technology keeps advancing, these businesses ultimately sell streamlining and convenience, qualities that sell like wildfire in the modern era. Time is money, time is work, time is invaluable.
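To make this concrete, here is a minimal sketch of what “less clutter and consistent color” can look like in practice, assuming R with the ggplot2 package; the data frame and its numbers are invented purely for illustration.

```r
# A minimal, hypothetical decluttering example in R with ggplot2.
# The data frame and its values are invented for illustration.
library(ggplot2)

sales <- data.frame(
  month   = factor(month.abb[1:6], levels = month.abb[1:6]),
  revenue = c(12, 15, 14, 18, 21, 19)
)

ggplot(sales, aes(x = month, y = revenue)) +
  geom_col(fill = "steelblue") +                    # one consistent color
  labs(title = "Monthly revenue (thousands USD)",
       x = NULL, y = NULL) +                        # drop redundant axis titles
  theme_minimal()                                    # strip background clutter
```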
On reflection, then, the time I spend cleaning up my own work for easy viewing and access is not something to take for granted but a skill worth valuing and improving, and I should keep looking for better ways and tweaks going forward to market not only data I want to share but also myself. After all, if people find time to dress up and market themselves, they can surely find time to market a product they want to share and sell.
SOME FINAL THOUGHTS:
Since most of the class were freshmen, taking it as a senior was quite interesting. Having encountered data and its uses in various other classes, I had plenty of exposure to these techniques both in a real workplace and in the classroom, and I had also done data collection as a psychology minor. Even so, I did not regularly deal with data in the forms we built during the class, such as XML documents and charts I created myself. Funnily enough, in my programming languages class the professor asked if anyone knew XSLT, which I had picked up from this course. Only I raised my hand, and he then joked about it and explained why, from a linguistic and language-design standpoint, he considered it terrible in almost every way and why it has fallen out of use. I had a good laugh out of that. Regardless, even though the probability, statistics, data collection, and processing were all skills I had covered in earlier courses, this class was clearly not designed for seniors, and it would certainly give useful exposure to students earlier in their programs. I really have no complaints about the class; its concepts and teachings are helpful, especially to new students, and even to an older student like me. My only real gripe is the blog requirement. It made little sense to me to force student reflections when how well we absorb the material should already show in the homework. I also dislike enforced word counts, and having never kept a blog before, making a Tumblr for a class and writing pieces that connect to lecture while finding outside links to tie in was a struggle, since I could easily have written these posts without outside material.
Link: http://blog.visme.co/examples-data-visualizations/
https://www.sas.com/en_us/insights/big-data/data-visualization.html
Data Analytics, Prediction, Toast, and You
Following the general direction this blog has taken, once we reached the topic of data prediction and analysis, with its standard deviations and confidence intervals, I figured such a broad and widely applicable subject would be better presented through personal experience: how I view it as a computer scientist and software engineer, and what insight it offers into how a startup collects data and then uses it, both in its current form and in anticipation of future needs. Along the way we can also touch on how larger companies apply these data tools to their strategies, and compare how a large enterprise uses them versus a smaller company.
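Before getting into Toast itself, here is a tiny sketch of the interval-estimate idea from lecture, written in R; the daily sales figures are invented purely for illustration.

```r
# Toy example: estimate mean daily sales and a 95% confidence interval.
# The numbers are invented for illustration only.
daily_sales <- c(410, 385, 402, 450, 398, 421, 433, 415, 390, 442)

n  <- length(daily_sales)
m  <- mean(daily_sales)
s  <- sd(daily_sales)                                  # sample standard deviation
ci <- m + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)

m    # point estimate of the average
ci   # 95% confidence interval for the true mean
```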
At Toast, it was near the end of my term there that a data analytics team was spun up, branched off from the scale and data team. The company was nearing 700 employees, and its product had reached the point where it was sustainable and doing well, so it was time to look forward. Toast POS, a point-of-sale system designed for restaurants, helps restaurants streamline their operations for faster, better-quality service, which in turn increases customer satisfaction and reduces staffing costs. With those basic functions steady and new ones constantly being added for individual restaurants, loyal customers in turn request features they would like to see. As a result, the data analytics team grew, focused on using the data restaurants already generate: what diners liked, what they did not, and how restaurants can continue to grow their client base.

From the wealth of existing data, such as customer age, what they ordered, how many people were at the table per visit, and what time of day they ate, the team worked on helping restaurants anticipate demand and grow their success, in return for the data those restaurants feed Toast through the point-of-sale devices. On a smaller scale this was already happening, as it does at most companies: internal algorithms (not public) were integrated with various graphing platforms to showcase the data company-wide, making employees aware of what has been done, what needs to be done, and what the goals should look like going forward. In a startup, future planning is critical, and being able to forecast budget, spending, and revenue matters enormously. That is why it surprised me that most of this forecasting was being done by upper management rather than by data scientists.
In a similar vein, larger companies with larger resources apply the same techniques in their own domains, such as Walmart for supply chain, and others anticipate their next big product, their next target audience, and what hires will be needed to keep growing. As the article notes, some large companies have gone a step further and handed data analysis and prediction off to artificial intelligence in order to remove the emotional component. This has already seen some success, but it remains to be seen whether human intuition, the gut feeling grounded in knowing how other people think, act, and feel, still has an edge at times; that is something machines cannot quite do yet.
As Toast continues to evolve, I look forward to seeing just how they do, and what they do, in contrast to the other larger companies on their path of growth.
Link: https://pos.toasttab.com/pos-system/analytics
https://www.clickz.com/5-businesses-using-ai-to-predict-the-future-and-profit/112336/
Security Versus Data Collection - A Razor’s Edge
Given our discussion of information security in class last week, I figured it would be good to write about its importance, and about what security really means given the kinds of data large websites and companies collect from us as consumers through browsing and searching alone. In other classes I have learned about denial-of-service attacks, man-in-the-middle attacks, and other ways people exploit technology to obtain information they should not have. Here, the discussion worth having is what constitutes hacking, what information services collect that we may not know about, and whether that should concern us.
Since this is an information science class, we often talk about how to gather data and how to use it. What many people do not know, however, is how large corporations apply those same techniques to you, in ways you may not have knowingly consented to. Funnily enough, a friend of mine told me this past Tuesday before class that Facebook lets you see the ads it generated from your browsing and profile information and the audience categories it places you in for advertisers. We joked that it knew him better than he knew himself, which might have been true for parts of his past he was no longer aware of. It knew his roommates, his interests, his family, his environment, his family's environment, and more. It was scarily accurate, built on an unimaginable amount of data. So how did they get it?
Whether you know it or not, many companies, Google and other popular search engines among them, reserve the right to collect your data when you use their services. That means everything you search, every site you visit, everything you interact with, and everyone you talk to. From there they go further, guessing what you might like through complex algorithms and generating results and advertisements in response, both to gauge your interest and to generate revenue. They also place cleverly designed dialogs that let you hide certain ads, which, annoyed, you do; that is more data collected, further refining their model of you as a person. Social media platforms such as Facebook and YouTube do much the same. With large companies built around metadata, a common buzzword, there is no shortage of data they can collect from you, whether you are aware of it or not. But is it legal? Yes: under the terms of service you agree to when using these products, they have the right to do so. Does it make you slightly uncomfortable? Probably.
But what can we do about it?
Either use the services and allow them to collect data, or cut yourself off from much of modern technology. Framed that way, the decision is quite clear. As technology has advanced, many things that once made us uncomfortable are now brushed off easily, and we can only assume that with time, even if everyone became aware of this data collection, most would not mind, or would not feel they had a choice in the matter. A related debate is whether the government has the right to look into our lives through our technology, and most people seem either fine with it or in denial about it if they choose to be. This downside is something that comes with technology, and something we, as people and as data scientists, have to come to terms with. For if we disallow these methods of data collection, then we ourselves have no basis to collect data in situations where explicit agreement and consent are not always obtained.
Links: https://www.forbes.com/sites/bernardmarr/2016/03/08/21-scary-things-big-data-knows-about-you/
https://www.villanovau.com/resources/bi/how-is-it-legal-companies-collect-data/
SQLite vs MySQL - What’s the Deal?
In class we’ve used the relational database SQLite to represent our schemas and entity relationships, mapping our ideas of data and objects into a virtual space. When I heard we would be using SQLite, I instantly recalled an interesting episode at my recent co-op involving its use on Android specifically, and why the team had started with SQLite but switched to MySQL. While I was missing some of the reasons and motivations behind the migration (it happened well before my co-op term), it prompted me to look at the differences between SQLite and MySQL, having used both in nearly identical ways in this class and earlier ones. We have also used PostgreSQL with Docker, containers, and ports, but I won't discuss that here.
So first off, with SQLite and MySQL both being relational databases, what makes SQLite unique? As the name suggests, SQLite is the “lighter” of the two: a much smaller, more compact engine that provides similar functionality through different means. As a minimalist program, SQLite does not run any background server process (a daemon). This has a few consequences, the main one involving how queries are handled. Queries are not passed to a server process; the SQLite library reads and writes the database file directly through ordinary operating-system calls. Without the client-server overhead, individual queries can be slightly faster when a single application is making them. When multiple processes make calls at once, however, much larger problems can arise: writers contend for locks on the database file, and locking, thread-safety, and other concerns that a server would normally manage fall on the application.
That being said, SQLite boasts a large advantage for certain use cases: its file-based implementation. The entire database is a single file, which makes SQLite portable, compact, and easy to move around. MySQL, on the other hand, involves ports, setup, and supporting applications, along with its external dependencies. SQLite also drops user management, which keeps usage and administration extremely simple.
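As a rough sketch of that file-based simplicity, here is what using SQLite from R might look like, assuming the DBI and RSQLite packages; the table name and rows are invented for illustration.

```r
# The whole database lives in the single file "orders.sqlite"; there is no
# server or daemon, and the file can be copied around like any other file.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), "orders.sqlite")

dbWriteTable(con, "orders",
             data.frame(id    = 1:3,
                        item  = c("burger", "salad", "coffee"),
                        price = c(8.50, 7.25, 3.00)),
             overwrite = TRUE)

dbGetQuery(con, "SELECT item, price FROM orders WHERE price > 5")
dbDisconnect(con)
```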
From the features above, engineers can spot some glaring problems for production use: multi-user management, which is mandatory in practically any public-facing application, is missing; external features and scaling potential are limited; and far fewer data types are supported than in MySQL. SQLite truly lives up to its namesake: a more compact, portable, stripped-down relational database.
So if SQLite is just a smaller, less flexible relational database, why use it? While it scales less than MySQL, it offers a few real advantages. First, it's quick: with a single application accessing it, SQLite is speedy, easy to use, and carries negligible overhead. That same embedded nature, however, is also the source of most of its potential flaws once usage grows.
If you’re looking for a quick database for simple, general use, SQLite may be exactly what you’re looking for. And if it turns out not to be, data started in SQLite can be exported and imported into MySQL later, so it’s rarely a bad place to start if you’re not yet sure what kind of database you need.
Links: https://stackoverflow.com/questions/12666822/what-is-difference-between-sqlite-and-sql
https://www.digitalocean.com/community/tutorials/sqlite-vs-mysql-vs-postgresql-a-comparison-of-relational-database-management-systems
Why Data Scientists Consider R their Weapon of Choice
In class, we have discussed the usefulness and business applications of R, a statistical analysis language that lets us import, clean, analyze, and do much more with nearly any data set that can be read into the language. Both articles cited below go deeper than our class discussions into who uses R professionally and how, and we can also consider how to apply R in our own lives as data scientists, whether or not that is our field of study. In a loose sense, “data scientist” is a broad term that covers nearly everyone, since we all perceive the world as data in a variety of forms, even if we rarely label it that way.
As covered in class, R is used by data scientists primarily to streamline work with data. Beyond that, though, R has many features that make it practical for other applications and that contribute to its widespread popularity. First, R is open source. “Open source” is a label that gets thrown around as a buzzword to get people to adopt software, but R genuinely is open source: it is free to use, and engineers from all over the world contribute to it. That means the language is constantly expanding and adapting to new use cases, since people can extend it or simply add the features they need. This flexibility and ease of integration also make it a good base to build applications on; its core data-handling functionality can be inherited and exposed on higher-level platforms with a UI and more approachable interfaces. As a result, a plethora of extensions, packages, and applications built on R are available to anyone, removing the coding barrier that might otherwise scare off potential users. Those familiar, form-like interfaces make R usable, even just as a framework, by almost anyone.
Along with the use cases discussed in class, there are a few other reasons R is among the most widely used languages for data work: it is a minimal language with few requirements, it is easy to use, and its low barrier to entry helps offset the shortage of trained data scientists. Under the hood, data scientists prefer it for a couple of reasons:
1. R is built around vectors, one-dimensional collections that can hold arbitrarily many values (limited only by memory). Vectorized operations remove the need for explicit loops, which cuts out a whole class of bugs such as off-by-one errors and accidental infinite loops (see the short sketch after this list).
2. R is interpreted, meaning code is executed directly without a separate compile step. You don’t need a compiler or anything special at run time, and you can run code line by line in an interactive session, which makes the language faster to work with and easier to manage, even if raw execution is generally slower than compiled languages.
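As a short sketch of point 1, here is what working with vectors looks like in R; the scores are invented for illustration.

```r
# Vectorized operations replace explicit loops.
scores <- c(72, 88, 95, 61, 84)

scores * 1.05          # apply a 5% curve to every element at once, no loop
mean(scores)           # summary functions operate on the whole vector
scores[scores > 80]    # filter with a vectorized comparison
```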
For these reasons R is a staple language for data scientists regardless of field. All this talk of programming might seem to narrow down who actually uses these tools, but as discussed earlier, “data scientist” is a very general term. For all the advantages and extensibility R offers to programmers, the language itself is designed to be simple enough for any scientist to use, and every scientist has to deal with data.
Links: https://elearningindustry.com/applications-r-programming-r-eal-world
https://www.quora.com/What-are-some-real-world-applications-of-R
Database Applications and Their Future
With an exam this week, there was less lecture content than usual. Even so, we touched on SQL, its uses, and why it is the standard for relational databases. As discussed the week prior and in my previous post, relations map closely to the logical way we organize and connect ideas in our minds. And yet, as anyone in industry knows, the larger the data, the larger the tables, and the less efficient the resulting tables and queries become. So it is worth asking: in what contexts is SQL a great resource? At what scale does it begin to slow down? What alternatives exist, and what do they bring to the table? Let's find out.
So why are SQL databases the most common choice for companies large and small? First, the schemas and relations are easy to create, read, and understand. Primary keys provide unique identifiers linking real items to their representations, and with well-designed keys and tables, queries do not have to duplicate data and can run quickly at low cost. Built-in tools like JOINs and table locks add to SQL's strengths, all while requiring relatively little code to define the database and its queries.
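As a small illustration of those keys and joins, here is a hedged sketch that drives the same SQLite engine we used in class from R through the DBI and RSQLite packages; the table names and rows are invented.

```r
# Two tables tied together by the primary key customer_id, queried with a JOIN.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "customers",
             data.frame(customer_id = 1:2, name = c("Avery", "Sam")))
dbWriteTable(con, "orders",
             data.frame(order_id    = 101:103,
                        customer_id = c(1, 1, 2),
                        total       = c(24.00, 9.50, 31.75)))

dbGetQuery(con, "
  SELECT c.name, SUM(o.total) AS spent
  FROM customers c
  JOIN orders o ON o.customer_id = c.customer_id
  GROUP BY c.name
")
dbDisconnect(con)
```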
For all its great qualities, SQL is not without shortcomings. View dependencies on tables and raw storage usage are both potential disadvantages, since storage costs money and dependencies require maintenance. In the modern world, data also needs to be reachable through an interface, which SQL does not provide in a clean, simple way on its own. The largest flaw engineers run into, however, is the lack of flexibility: for any amount or form of information, a schema must be designed, and it must be designed with enough openness up front to avoid a full database overhaul later. As requirements have grown, larger companies have started branching out to other database technologies, most notably NoSQL databases.
So what is a NoSQL database? “A NoSQL or Not Only SQL database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.” In practice, NoSQL behaves much as its name suggests: a counterpart to SQL in which the stored models carry no fixed schema and no relations. As a result, these databases scale extremely well, which companies handling big data (like Amazon) require. But without schemas and relations, how do users query the data? How can it be efficient?
First off, different data models can be stored in the database regardless of their form. Each model is used only for what it is intended for, rather than being coerced into an all-in-one model padded with extra fields to fit multiple object shapes, the way most programming languages force items into class-based objects. So NoSQL offers scaling and a more flexible model, but how does it match SQL in query speed and efficiency? SQL queries are designed to be quick and efficient, and in many cases they are. But when migrations run over large data sets, things can slow dramatically, with queries dozens of lines long joining numerous tables just to reach the desired data through one key. SQL hits a limit on how useful it is when billions of transactions are performed every day. NoSQL, by contrast, typically uses a key-value storage model, which is simple and straightforward: instead of joining multiple tables and fields, you have a key, and you get what you're looking for. Simple.
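To show the difference in access pattern, here is a sketch that is not a real NoSQL client at all, just a plain R named list standing in for a key-value store; the keys and the nested record are invented.

```r
# Key-value access: one key, one lookup, no joins and no schema.
store <- list(
  "order:101" = list(customer = "Avery", items = c("burger", "coffee"), total = 11.50),
  "order:102" = list(customer = "Sam",   items = c("salad"),            total = 7.25)
)

store[["order:101"]]$total   # fetch the whole document by its key
```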
NoSQL has its disadvantages as well, often trading consistency for availability. The CAP theorem says that when a network partition occurs, a distributed system must sacrifice one of the two; you cannot have both at once.
With all this being said, NoSQL sounds like the better solution. Perhaps, but NoSQL is still young and less mature, and it has yet to hit the substantial walls that only show up with heavier use. Support is thinner, with only a few notable public implementations such as MongoDB, even as the industry keeps expanding. In that sense, the faults and flaws of NoSQL may still be waiting to be discovered, while SQL has persevered through the years as the most used kind of database. Perhaps old and reliable really is the best alternative.
Links: https://www.quora.com/What-are-the-advantages-and-disadvantages-of-SQL
https://www.devbridge.com/articles/benefits-of-nosql/
XML and the Evolution of Data
Over the past week and a half of class we've been discussing XML, what it's for, and how to use it. What we haven't discussed much, however, is data sharing and delivery, which is the primary use of the XML format. XML was described in lecture as a staple of data representation, and it still is, largely because it was one of the only robust options for exchanging data at the time it emerged: a language that was simple, easy, and free for everyone to use.
Throughout my (admittedly short) professional career, I have used XML almost not at all. In classes and in API work, data is almost always sent as JSON (JavaScript Object Notation), with XML occasionally offered (by Jenkins, for example) but unused. So why is JSON so popular and so heavily used? What differentiates it from XML?
Reviewing the pros of XML, there are many reasons it has been so influential. It carries attributes, metadata, and tags, and above all it offers structure, validation, and schemas; as a markup language, it extends to uses well beyond data transmission. For the specific job of transmitting data, though, it has drawbacks: all that markup makes documents harder to traverse via XPath and larger overall when sent across the wire, which matters more and more in a world focused on speed and efficiency.
JSON, on the other hand, is a much simpler data format. It is essentially a map of key-value pairs, and that's it. So what does that offer us? For data exchange, the structure is straightforward and easy to access: trees of objects and arrays, which can in turn contain more "leaves". Because the representation is so bare-bones, it is also serialized and deserialized much faster, which, as we touched on earlier, matters a great deal for speed on the modern web. Easier, quicker parsing means a lower workload and higher efficiency. One of JSON's biggest advantages is its native support for arrays, which are easily created, mapped, and accessed within a JSON object; the same thing can be represented in XML, but it takes far more markup, code, and work. The trade-off is that JSON's flexible, compact curly-brace structure comes with looser constraints, so data errors can slip through that a validated XML document would catch.
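Here is a side-by-side sketch of the same tiny record in both formats, assuming R with the xml2 and jsonlite packages; the record itself is invented for illustration.

```r
library(xml2)
library(jsonlite)

# XML: tags, attributes, and XPath traversal.
xml_doc <- read_xml(
  '<order id="101">
     <customer>Avery</customer>
     <items><item>burger</item><item>coffee</item></items>
   </order>')
xml_attr(xml_doc, "id")                              # attribute on the root tag
xml_text(xml_find_first(xml_doc, "//customer"))      # XPath lookup
xml_text(xml_find_all(xml_doc, "//items/item"))

# JSON: keys, values, and a native array.
json_doc <- fromJSON('{"id": 101, "customer": "Avery", "items": ["burger", "coffee"]}')
json_doc$customer
json_doc$items                                       # arrays come back as vectors
```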
So is JSON better than XML? For data transfer, almost certainly, and most computer scientists would answer the same. Many would also argue the comparison is unfair, since data transfer was never XML's primary purpose. That comes back to the point that XML is a markup language; its use in data transfer at all is a mark of its extensibility and strong design. Even so, XML usage is declining as newer technologies are developed for more and more specific cases, such as JSON for data transfer. XML is still found all over the web, but its decline is not hard to see. That it remains implemented and used at all is a testament to just how impactful XML has been on the evolution of the internet and of data exchange.
Links: http://www.yegor256.com/2015/11/16/json-vs-xml.html
http://www.cs.tufts.edu/comp/150IDS/final_papers/tstras01.1/FinalReport/FinalReport.html
Data Modelling - Ways in Which We Simplify the World
Over the past week we have discussed in class how we reduce the information we receive from the world and the various methods we use to make that data easier to process and use. Having covered the use of data modelling in planning software, one thing we have yet to see is how data modelling applies outside software and how we can use it in our daily lives.
When we set out on a goal or project, we think about what we need to do and how to do it. In various ways, we have learned that preparation and planning are often the key to success. Data modelling presents itself as a tool that can vastly improve progress toward a goal in a short time, yet ironically, few people outside of computer science apply it deliberately, even though it is a multidisciplinary approach. In some ways, however, people do apply such schemas and themes subconsciously, as the mind labels incoming information and creates associations. This ability of the brain, powerful though still mysterious, has been studied extensively, and these features (which may or may not be the founding idea behind database schemas; I am not sure) have strengths and faults similar to databases. On the other hand, while difficult, the brain is easier to change, forming new associations more readily than the inflexible database designs whose pitfalls we discussed earlier in lecture.
So how do we create schemas in our minds? And if we do, how can a physical representation help us with projects and goals? A common measure of cognitive development in children is the testing of schemas and relations. What do you think of when you imagine a car? Got that image? Good. What does it look like? Even though you know there are countless cars of every shape, size, and style, a single image popped up. Why? That is the benefit of a schema: it lets us form associations, reference them, and project our mental representation of them easily, which in turn lets us take in new information and use it based on what we already know. In this way the brain is extremely flexible, never constrained to a fixed memory layout or to simple fields and keys mapping to other tables. In that sense the brain is closer to a NoSQL database, with spread-out data all mapped and connected, ready to rebind or drop references at any time. Databases are not so fortunate. Adding a new field can generate hundreds of problems, especially for existing data; the resulting NULL fields can be disastrous if a migration misses some set of rows. Fortunately for humans, we can simply say, "I don't know".
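As a small sketch of that "new field, old rows" problem, here is what it looks like in the SQLite engine from class, driven through R's DBI and RSQLite packages; the table is invented for illustration.

```r
# Adding a column after the fact leaves every existing row with NULL (NA in R).
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "cars", data.frame(id = 1:2, model = c("sedan", "truck")))

dbExecute(con, "ALTER TABLE cars ADD COLUMN color TEXT")
dbGetQuery(con, "SELECT * FROM cars")   # color is NA for both existing rows
dbDisconnect(con)
```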
So the question remains: if the brain is so far ahead of databases and their rigid representations, why bother? Can we really benefit from schemas outside of virtual data representation? Of course we can, because of one glaring limitation: we are human. For all the amazing feats humans can pull off, we have restrictions. Mental schemas are hard to change and are individualized to each of us; why one person imagines a coconut when the word "nut" comes up may be baffling to most and obvious to a few. We therefore need a shared representation, and a physical, written-down schema provides exactly that. It makes it easier to get people on the same page and aligned on a direction, something infamously hard in projects. Psychological studies also show how hard it is to change mental schemas, which creates further problems when a group needs to settle on a collective idea; if everyone simply folded others' ideas into their own private schema, there would be no real way of knowing whether everyone actually had the same idea.
We also process an overwhelming amount of information, and that extraneous data disturbs our concentration and memory whether or not we have associations for it. It is therefore important that a physical model be easily accessible, something we can visualize and instantly recognize to put ourselves in the right mindset.
Across many fields, data modelling can set people up for success. Whether you work in biology, psychology, or art, data modelling is for you.
Links: https://www.linkedin.com/pulse/why-data-modelling-important-munish-goswami
http://jamesclear.com/schemas
The Significance of Data Collection in our Lives
I am often reminded of the importance of data in the world, across every field. Despite being a computer science student, I often forget how broadly data ranges, from something as simple as our eyesight and how the brain processes it, to data stored in a database in a cloud service hosted in a server warehouse thousands of miles away. Hearing in lecture about information collection techniques I had come across in the real world reminded me of my first exposure to data collection: AP Psychology in high school, and my original college major as a Health Science student in the Bouvé College at Northeastern University. My first exposure to information collection, though I never labeled it as such, was probably reading about Skinner's box and about how older studies, though their methods would not be permissible today, remain relevant and serve as foundations for much of modern psychology.
Opening Skinner's Box discusses, among other things, the myth that Skinner, the well-known psychologist, raised his own child in a morally questionable environment (in this case, an actual box) in order to collect data on how surroundings affect a growing child. The book also covers many of the historic studies that led to stricter restrictions and regulations on what is allowed in experiments, rules that still govern what counts as morally acceptable today. The most widely known is probably Milgram's obedience experiment, in which volunteers were told to press a switch they believed was delivering painful electric shocks to another person. The emotional harm this caused the volunteers is a large part of why such a study was deemed immoral and would not be permitted today.
It also dawned on me that just about everything we do is a form of data collection, from my high school self-learning project of teaching myself guitar while documenting my progress, to simply touching a hot stove. All the information we receive is processed and used going forward, much as in any psychology study. More recently, my past two co-ops involved interviews through different mediums, online and in person. With each interview I got a better sense of what information interviewers look for, how to give them what they want, and how to conduct interviews myself. I was also given the chance to sit in on interviews for full-time hires, which shaped how I approach that form of information collection and how to handle different individuals. Likewise, the company itself, Toast, a rising Boston startup, frequently used company-wide surveys to gauge satisfaction with events and operations and to check whether teams were meeting their goals, alongside 1-on-1s with advisers to discuss personal growth or simply to air concerns. In that sense, almost all data is valuable, even if skewed or incorrect, because it can later be flagged as outliers or used to help arrive at a better average.
Even though we work with information collection and data every second of our lives, we often fail to notice it, because it is simply part of how we live. Being able to talk about these ideas in lecture is a welcome reminder, a chance to step back and reflect on how we do it, why we do it, and how to improve not only our abilities as data scientists but as intelligent beings.
Links: https://en.wikipedia.org/wiki/Opening_Skinner%27s_Box
https://pos.toasttab.com/
Data Modeling, Excel, Decision Trees
I often consider myself an Excel savant. Pie charts? Check. Line graphs? Done that. Visual Basic macros and real-world coding experience? QED. Coming into IS2000, Principles of Information Science, I didn't expect to learn much from a freshman-level class as a computer science senior. After all, I'd done it all: heavy workloads, late nights coding, team projects, and real-world experience with strong reviews. Yet though class has only been in session for two weeks, I'm beginning to connect dots from this class to learning outside of computer science, constantly reminded why computer science and information science are interdisciplinary.
I've known many friends and colleagues from computer science who involve themselves in big data or business analytics, and I often questioned why. Computer science should be about coding, right? Creating things. Breaking things. Evolving technology to better the world and improve the quality of our lives. I'll admit, with a brother getting his Master's in business, I had recognized that I would need to familiarize myself with that side of industry sooner or later, part of the "adulting" process. As I've gone through this week's lectures, the connection I'm making is not to coding of any sort, since the course is not meant for that, but to real-world business decisions that may help further my future career. Not how to use Excel, but real examples of how these strategies are employed (see the Gerber article). From my computer science background and its extensive (yet mandatory) course load in algorithms, data, and logic and computation, the way data is organized here seems to connect, at a more elementary level, to the same kinds of tree structures, such as binary trees, used when data is eventually stored and linked. The connections, at least, seem to be there.
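For a flavor of the decision-tree arithmetic behind pieces like the Gerber article, here is a toy sketch in R; every probability and payoff is invented for illustration.

```r
# Compare two branches of a decision tree by expected value.
p_success <- 0.6
launch    <- p_success * 500000 + (1 - p_success) * -200000  # risky branch
hold_off  <- 50000                                           # certain branch

c(launch = launch, hold_off = hold_off)   # pick the branch with the higher value
```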
So far, none of the information presented in class has been new to me, but the class is structured as a building block for new learners. While for some of us this may be review, I'm looking forward to learning new applications of these strategies, albeit from a business perspective rather than a computer science one.
Links: https://en.wikipedia.org/wiki/Binary_tree
https://gbr.pepperdine.edu/2010/08/how-gerber-used-a-decision-tree-in-strategic-decision-making/