We hear the buzzword ‘Data Science’ everywhere today. Data science is said to be THE future.
Many online platforms recommend learning data science to secure a high-paying job in the future, and there is a plethora of online and offline resources for learning Data Science and related technologies.
Is it true?
How is data science so important and promising? Why is everybody recommending and learning data science? What is data science after all?
Through this article, we are going to explore the answers to all these questions and learn how data science is already changing the way big companies work, as well as the way we go about our day-to-day activities.
What is Data Science?
Of course, it has something to do with data. But what?
Suppose you walk into a medical shop with a list of medicines. These are regular medicines that you buy every month for your grandmother. After two or three visits, the pharmacist recognises you and knows exactly what you want. If a medicine is not available, he can suggest alternatives or pre-order it for you. Since your data is small and structured and he knows you in person, it is easy for the pharmacist to know your preferences.
Now imagine the digital world, where millions of customers log in to Facebook and Instagram, or shop on Amazon, Flipkart and so on. Suppose the owners of these platforms wanted to know the preferences of each user:
- How would they segregate the useful information and the non-useful information?
- How would they understand each user’s preferences?
- How would they provide future solutions based on current inputs?
And all this at a huge scale, with mostly unstructured data!
This is where Data Science steps in. It is a discipline that uses scientific methods, algorithms and processes to extract useful information from the mass of data received from various users, process that data, and provide business insights and potential solutions to business problems.
Can you think of situations where this can be used? We just saw one: online shopping!
Data science is used in many fields today, such as healthcare, telecom, online shopping, gaming, image and speech recognition, AI and IoT.
Top Data Science Applications
1. Recommender systems
Used in most commercial applications, typically online services like Amazon, Netflix and YouTube, recommender systems are slowly becoming part of our lives by giving us relevant choices based on our preferences. Whether you like it or not, an online recommender system will suggest what it thinks you would like, and an efficient one can generate a lot of income for the business. Of course, we all like people who understand our preferences. For example, my mother knows what I like and makes my meals accordingly. Similarly, if the YouTube recommender system works out that I like a particular genre of songs and plays similar ones, I am happy because I get to listen to my favourite songs without having to create a playlist manually!
These systems typically work in one of two ways:
- Collaborative filtering, where new recommendations are produced from past user interactions. This data is stored in a user-item interaction matrix, and recommendations are derived from it, so the result is based purely on past user data.
- Content-based methods, which use past interactions plus some additional information about the user or the items.
For example, to show you the kind of movies or shows you might be interested in, a service like Netflix can look not only at your past searches and views but also at details such as your age, gender and location to get the best results. If viewers with a profile like yours tend to watch documentaries, you may be shown more documentaries, and if you are located in Maharashtra, you might also like shows from an upcoming local comedian. The system builds a model based on observed patterns and predicts user preferences.
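The collaborative filtering idea described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a tiny hand-made user-item interaction matrix; the user names, film names, ratings and the choice of cosine similarity are all hypothetical and not taken from any real service.

```python
from math import sqrt

# Hypothetical user-item interaction matrix: the rating each user gave each item
# (0 means the user has not interacted with the item yet).
ratings = {
    "asha":  {"film_a": 5, "film_b": 4, "film_c": 0},
    "ben":   {"film_a": 4, "film_b": 5, "film_c": 5},
    "chris": {"film_a": 1, "film_b": 0, "film_c": 2},
}

def cosine_similarity(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    items = set(u) | set(v)
    dot = sum(u.get(i, 0) * v.get(i, 0) for i in items)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(user, ratings):
    """Rank items the user has not rated, weighting other users' ratings
    by how similar those users are to this one."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if ratings[user].get(item, 0) == 0 and rating > 0:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("asha", ratings))
```

Real recommender systems work with millions of users and items and use far more sophisticated models, but the core idea is the same: users with similar histories are used to predict what you have not seen yet.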
2. Healthcare
We have often heard about an incorrect or incomplete diagnosis delaying the right treatment, sometimes with disastrous results. Data Science is helping to reduce such cases.
While we may think that Data Science has taken only the online world by storm, healthcare is one area where it has been used extensively and with a lot of success.
Tasks like maintaining computerized medical records, discovering new medicines, extracting data from images obtained from different scans, researching new fields like genetics, and making diagnostics more accurate have become easier and more efficient.
Here are some of the areas where Data Science (DS) has brought revolutionary changes:
1. Medical imaging
For major problems of the body, doctors recommend an MRI, an X-ray and sometimes a CT scan too. Doctors had to analyse the results manually, and it was sometimes almost impossible to spot the smallest deformities in the innermost and most important parts of the body. This led to improper diagnosis and incorrect treatment.
With deep learning techniques in DS, scanned images can be segmented, and operations like image enhancement, reconstruction and correction can be applied to them. Using these image processing techniques, doctors can see the relevant details more easily and provide a more accurate analysis.
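To give a feel for what "image enhancement" means at the pixel level, here is a minimal sketch of contrast stretching on a tiny grayscale image. It is a simple classical technique, not the deep learning segmentation mentioned above, and the pixel values and function name are made up for illustration.

```python
def stretch_contrast(image):
    """Linearly rescale pixel intensities so they span the full 0-255 range."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely flat image has no contrast to stretch
        return [row[:] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]

# Hypothetical low-contrast scan: all values sit in a narrow band (100-130)
scan = [
    [100, 110, 120],
    [110, 130, 120],
    [100, 110, 120],
]

enhanced = stretch_contrast(scan)
for row in enhanced:
    print(row)
```

After stretching, the small differences between neighbouring pixels become much larger, which is what makes faint structures easier to see.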
2. New drug discovery
Drug discovery normally requires a lot of effort as well as funding. It is a complex and risky process that needs thorough testing. However, with machine learning algorithms and DS, insights and predictions about a product can be far more accurate and can be produced in far less time. Based on information from patients, various models are developed, trained and tested. New developments can be predicted based on past data as well as genetics.
3. Keeping track of patient health
Have you heard of IoT (Internet of Things)? IoT is a concept wherein a device is connected to other devices via the internet, so that each can collect and share data.
With this concept in mind, let us say a patient has a wearable band (or device) that tracks their heart rate, temperature, BP and a lot of other vital parameters, and shares this data with another device (e.g. a computer) that analyses it using Data Science techniques. Using the analysed data, doctors can manage the patient’s health remotely from any location and suggest relevant treatments. Real-time analytics help doctors monitor critical patients and advise a future course of action based on their current condition.
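The monitoring step can be pictured with a very small sketch: each reading from the wearable is checked against a normal range, and anything outside it is flagged for the doctor. The ranges, field names and readings below are hypothetical and for illustration only; they are not clinical guidance, and a real system would use far richer models.

```python
# Hypothetical "normal" ranges, for illustration only (not clinical guidance)
NORMAL_RANGES = {
    "heart_rate": (50, 110),        # beats per minute
    "temperature_c": (35.5, 38.0),  # degrees Celsius
}

def check_reading(reading):
    """Return the names of the vitals in one reading that fall outside their range."""
    alerts = []
    for vital, (low, high) in NORMAL_RANGES.items():
        value = reading.get(vital)
        if value is not None and not (low <= value <= high):
            alerts.append(vital)
    return alerts

# Hypothetical readings streamed from a wearable device
readings = [
    {"heart_rate": 72, "temperature_c": 36.8},
    {"heart_rate": 128, "temperature_c": 38.4},
]

for reading in readings:
    print(reading, "->", check_reading(reading) or "ok")
```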
4. Virtual assistance apps
These are platforms where a patient can type in their problems and symptoms, and algorithms then determine the possible diseases based on the data entered. With such a platform, patients who would otherwise not like to discuss their problems can come forward and remain anonymous. Psychological issues in particular are often hard to share, and these platforms (in the form of apps) make it easier for patients to seek help. Some common examples are Ada, Babylon and K Health.
5. Genomics
A genome is the complete genetic material of an organism. It consists of genes and other DNA sequences. Genomic data is collected through the study of genomes (called genomics) using data processing software built for the purpose. Data Science is used extensively to analyse the structure of genomes as well as other vital genomic parameters. This helps researchers understand the functions of specific genes. Analysis and processing of such data can be revolutionary for the future of bioinformatic systems.
6. Predictive analysis for improved healthcare management systems
Using a predictive data science model, machine learning algorithms can find patterns and produce accurate information about certain symptoms and diseases, helping with early prevention and treatment. Such models can also help improve patient care and the logistics of medical supplies and other pharmaceutical items.
3. Finance Industry
Financial institutions are heavily data-driven, and financial data scientist is an established position that comes with a lot of responsibility and, of course, money. The job is important because it involves risk detection and monitoring, surveillance, fraud detection and prevention, claims, payments, improving the overall customer experience and so on. Financial data science draws on statistics, decision science, predictive analysis and more to build the right model and produce meaningful and useful results.
1. Risk analytics
With risk analysis, a company can prioritize and make strategic decisions to become a key player in the market. Risk analysis also helps companies assess the creditworthiness of users by applying machine learning algorithms to their transaction data.
2. Real-time data analysis
Through data science, finance companies are now able to track user transactions, credit scores and similar data in real time.
3. Data management and personalized customer services
Financial institutions need almost all the details of a user. They collect data in various forms, both structured and unstructured. Data science handles the complex unstructured data and uses techniques like data mining and text analytics to generate insights about the user. Using this customer data, ML algorithms can also analyse market trends. In addition, based on the type of transactions and other user behaviour, these institutions strive to provide personalised services and offers to different users, thus creating more business opportunities and profit.
4. Fraud management
Improvements in machine learning algorithms, which learn from the data they see, help data scientists spot unusual patterns and transactions, and so detect and prevent potential online fraud.
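As a rough illustration of "spotting unusual transactions", the sketch below flags an amount that lies far from a customer's usual spending, measured in standard deviations (a z-score). The transaction history, the threshold of 3 and the function name are assumptions made for this example; production fraud systems use far more signals than the amount alone.

```python
from statistics import mean, stdev

def is_unusual(amount, history, threshold=3.0):
    """Flag an amount whose z-score against past transactions exceeds the threshold."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts are identical, so anything different is unusual
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Hypothetical past transaction amounts for one customer
history = [42.0, 38.5, 51.0, 45.2, 40.0, 47.3, 39.9]

print(is_unusual(44.0, history))   # a typical amount
print(is_unusual(980.0, history))  # far outside the usual range
```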
5. Algorithmic trading
Algorithmic trading helps finance companies understand market trends and formulate new business strategies to grow their business. It consists of analysing huge streams of data to build a model that makes predictions about future market trends.
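One of the simplest trend-following rules gives a flavour of how price data can be turned into a signal: compare a short-term moving average with a long-term one. This is only a toy sketch with made-up prices and window sizes, not a description of how any real firm trades, and certainly not investment advice.

```python
def simple_moving_average(prices, window):
    """Average of the most recent `window` prices."""
    return sum(prices[-window:]) / window

def crossover_signal(prices, short_window=3, long_window=5):
    """Return 'buy' if the short-term average is above the long-term one,
    'sell' if it is below, and 'hold' otherwise or when data is insufficient."""
    if len(prices) < long_window:
        return "hold"
    short_avg = simple_moving_average(prices, short_window)
    long_avg = simple_moving_average(prices, long_window)
    if short_avg > long_avg:
        return "buy"
    if short_avg < long_avg:
        return "sell"
    return "hold"

# Hypothetical closing prices
rising = [100, 101, 102, 104, 107]
falling = [107, 104, 102, 101, 100]

print(crossover_signal(rising))
print(crossover_signal(falling))
```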
4. Logistics and Transport
With Big Data, many issues in the supply chain have been eased. Huge amounts of data are analysed using machine learning algorithms to estimate delivery times, manage inventory by forecasting demand and optimizing warehouses, maintain assets by finding usage patterns, and reduce overall freight costs by optimizing delivery paths using IoT for real-time data analytics. This is an area that needs more exploration and resources, as currently there are relatively few data scientists working on logistics systems. Cisco is one major brand that has managed to fulfil more orders in considerably less time using advanced data science techniques and tools. Cisco refers to this achievement as the ‘circle of light’ visualization.
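Demand forecasting, mentioned above as a way to manage inventory, can be illustrated with the simplest possible model: predict next week's orders as the average of the last few weeks. The order numbers and the window of three weeks are hypothetical; real systems account for seasonality, promotions and much more.

```python
def moving_average_forecast(demand, window=3):
    """Forecast the next period as the mean of the last `window` periods."""
    if len(demand) < window:
        raise ValueError("not enough history for the chosen window")
    return sum(demand[-window:]) / window

# Hypothetical weekly order counts for one warehouse
weekly_orders = [120, 135, 128, 140, 150, 145]

forecast = moving_average_forecast(weekly_orders)
print(f"Forecast for next week: {forecast:.1f} orders")
```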
5. Speech and Image Recognition
“Hey Alexa! What’s up?”
Familiar with that?
Yes, Amazon Echo, Cortana and Siri are all AI-powered intelligent systems that recognise speech and interact like humans. If you say something like ‘Alexa, fix me a hair appointment’ or ‘Alexa, find nearby restaurants’, it will usually give you accurate results!
While speech recognition has been around for more than a decade, its popularity and scope have increased sharply since deep learning and data science techniques enabled more accurate and faster results. Speech recognition needs a lot of audio data, which has to be processed into text. For example, tone and pitch vary from person to person. One person can say ‘Alexa’ very fast, while another says it slowly. These systems have to recognize different pitches, tones, speeds and much more, and hence require a lot of data to build and train a recognition model.
So, what about image recognition and processing?
Say you want to find more information about the fruit apple on Google, so you type ‘Apple’. However, based on popularity, Google shows you links about Apple computers. Now you have to modify your search and say ‘fruit apple’. This can be boring and sometimes frustrating. Instead, what if you could just show Google a picture of an apple and get the search results without typing? This kind of visual search is possible through image recognition and matching.
When you upload photos on Facebook, it suggests whom to tag by recognizing the faces of your friends! Whoa! How does that happen? Facebook analyses the data you have stored on it and applies algorithms that help its systems recognize who is who. When you set up face lock on your smartphone, you have to take two or three photos of yourself from different angles for the phone to identify you correctly. So, imagine the amount of data Facebook processes to accurately identify you and others in a huge group picture! All this is possible with data science and big data, where data is analysed, a predictive model is built and the outcome is a set of decisions.
Just as a human recognizes pictures based on what the eye has seen and the memory has stored, a computer recognizes images from their stored representation, either as a vector image or as a sequence of pixels (raster).
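To make the "sequence of pixels" idea concrete, the sketch below stores tiny black-and-white images as grids of numbers, flattens each into one long vector, and matches a new image to the closest known one. The 3x3 images and labels are invented for illustration, and nearest-neighbour matching on raw pixels is a deliberately simple stand-in for the deep learning models used in practice.

```python
def flatten(image):
    """Turn a 2D grid of pixels into a single sequence (vector) of values."""
    return [pixel for row in image for pixel in row]

def distance(image_a, image_b):
    """Sum of squared pixel differences between two same-sized images."""
    return sum((a - b) ** 2 for a, b in zip(flatten(image_a), flatten(image_b)))

def closest_label(query, known_images):
    """Label of the known image that is most similar to the query."""
    return min(known_images, key=lambda label: distance(query, known_images[label]))

# Hypothetical 3x3 raster images: 1 is a dark pixel, 0 is a light one
known_images = {
    "vertical_bar":   [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "horizontal_bar": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}

# A slightly noisy vertical bar (one extra dark pixel in the corner)
query = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]

print(closest_label(query, known_images))
```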
Image recognition is widely used in security and surveillance, object and gesture recognition, visual geolocation, image processing in healthcare and medicine, industrial automation and more.
6. Gaming
Can you think of the kind of data that is stored when you play a game? Take Farmville, for example. A huge amount of game data and state is stored: your points, level, play time, interaction time, peak activity time, rest time, diamonds achieved, friends, invitations and so on. Well, all this data doesn’t simply lie there! Everything is collected and analysed. As someone once said, ‘nothing in this world is free’.
Your game data and user data are processed for many reasons: to understand what problems you are facing as a player and which features can be improved, to suggest similar games that you might like, to identify your patterns and suggest more attractive offers that increase your game time, and so on. Basically, machine learning algorithms process huge amounts of data and build a predictive model to identify trends and improve the overall gaming model. Big data analytics also helps determine whether the game is giving the desired results and yielding a profit. Gaming systems also use AI, advanced graphics, image recognition systems and personalized marketing, making Data Science indispensable for the gaming world.
7. Digital Marketing
SEO or Search Engine Optimization is another buzzword that we hear a lot today. With so much content being poured onto the internet every day, Google has made its ranking algorithms stricter than ever.
Have you ever scrolled past the first page of results in search of what you are looking for? Most probably not! Sometimes we do not even look beyond the third or fourth result. So, how come some websites remain on top and are visited more often than others?
The answer is – yeah – Data Science!
I subscribed to Grammarly a few weeks ago, and ever since I have been getting a weekly email report of my performance based on the documents it checks for me, including the number of words corrected, my percentile and improvements in my writing. Last week I did not use it, and this morning I got a mail saying, “It’s been a while since you have used Grammarly.” Inside there was an offer waiting to be grabbed! That is the power of data collection and analysis.
With Data Science tools and techniques, one can get useful insights about a website’s performance, relevant keywords, most searched topics and words, unusual traffic, top conversion paths, redirects and much more. You can also find out how long a user stayed on a page, the page load time, any errors, user clicks and so on. All this data can be visualized, compared and analysed to help you create better content and prepare better SEO campaigns, resulting in more traffic for your website.
8. Master Data Management
Master data is the critical data that every organization maintains to carry out its regular affairs. It contains a huge amount of information and should be accurate, consistent and up to date. With companies maintaining multiple databases for storing and reading data, and the amount of data increasing by the day, Master Data Management (MDM) can seem like a daunting task.
But it need not be.
With Data Science, this information can be properly integrated, channelled and analysed, leading to a complete consumer experience. For example, if you are booking a flight, the next question you are asked is whether you want to book a hotel or rent a cab. In the same way, if the collected data shows that you travel often, companies try to give you offers and better deals for a more personalized and hassle-free experience. How can so much data be stored, and how is the relevant data picked out of such huge chunks?
Machine learning algorithms pick out the relevant information and analyse it for different purposes. Having a master data store lets them gather data from one central place for different business use cases.
9. Product Comparison Sites
Earlier, we just typed Amazon.in or flipkart.com to purchase what we wanted. Now there are better ways to find the best deals through price comparison websites. These sites compare similar products from various online stores and show the results in a clear, understandable UI so that we can make our purchase decision with ease.
How do these product comparison sites get data from so many websites and then filter the data by popularity, price, reviews, features etc…?
Most of the data comes from data feed files supplied by various vendors. This real-time information is collected, organized and structured by the price comparison sites using Data Science techniques like data crawling to produce the best insights.
Some of the popular price comparison websites are Shopzilla, Shopping.com and PriceGrabber.
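At its core, the comparison step means merging offers from several vendor feeds and sorting them by the field the shopper cares about. The sketch below assumes the feeds have already been parsed into Python dictionaries; the vendor names, products, prices and ratings are all hypothetical.

```python
# Hypothetical data feeds, one list of offers per vendor
feeds = {
    "vendor_a": [
        {"product": "headphones", "price": 59.99, "rating": 4.4},
        {"product": "keyboard", "price": 35.00, "rating": 4.1},
    ],
    "vendor_b": [
        {"product": "headphones", "price": 54.50, "rating": 4.2},
        {"product": "keyboard", "price": 39.99, "rating": 4.6},
    ],
}

def compare_offers(feeds, product, sort_by="price"):
    """Collect every vendor's offer for a product and sort them by one field."""
    offers = [
        {"vendor": vendor, **item}
        for vendor, items in feeds.items()
        for item in items
        if item["product"] == product
    ]
    # Lower is better for price; higher is better for rating
    return sorted(offers, key=lambda o: o[sort_by], reverse=(sort_by == "rating"))

for offer in compare_offers(feeds, "headphones"):
    print(offer["vendor"], offer["price"], offer["rating"])
```

Sorting by `"rating"` instead of `"price"` puts the best-reviewed offer first, which mirrors the filters these sites expose to shoppers.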
10. Augmented Reality
If you have been to the gaming arena of a busy mall, you will surely know what AR or Augmented Reality is. One of the malls I visited had a simulated roller coaster ride which was amazing: the total feel of a roller coaster while sitting in one place, with perfect graphics and sound effects!
It works on perception – what you see is what you believe. With special visual, haptic and audio sensory modalities, you are made to think that you are actually on a roller coaster ride. The motion, audio and visuals are all in sync for a perfect simulation.
The question now is – how is data science helping in the field of AR?
AR applications thrive on data, for example location-based or geospatial information, to give users a totally immersive and interactive experience. Through Big Data, more data points can be turned into analytical insights that reveal patterns through visualization. Conversely, with technology as advanced as AR, huge amounts of data can be represented and interpreted in a better manner using 3D graphics and charts. This application of Data Science is yet to be fully explored, as there are challenges involved. However, the future surely seems bright for the powerful combination of AR and Data Science.
There are many more areas where Data Science continues to make its mark. The gist is that as long as data is of prime importance, data science will continue to flourish and data scientists will have some of the best jobs in the industry, in terms of both the kind of work and the salary. Data science clearly is the need of the hour, today and in the years to come, in the form of more advanced AI, self-driving cars, human-like robots, cryptography and more. It will also play a major role in revolutionising the fields of agriculture, education, retail and others.