Category Archives: Data Science

Top Data Science Programming Languages

Introduction: Your preparation for data science would be incomplete without knowledge of programming languages. Although there are many tools on the market for different phases of data science, none of them can replace the power of programming languages. Many popular languages handle complex calculations, data analysis, and visualization well, out of which we have chosen the… Read More »

Data Science with Spark

Introduction: In this article, we will touch upon all the core concepts of the Spark API; however, our focus is on using Spark for data science. The details about Spark provided in the article will be sufficient for you to understand how to perform data science with Spark. For data collection and storage, Spark has a very nice distributed mechanism that… Read More »

Data Science for Beginners

Introduction: In this article, we have attempted to cover all the aspects of data science that a beginner should know. Whether you are a fresher or are coming from another job background, you can start learning data science. You can be a self-taught data scientist or learn through the various courses available online and offline. If you have… Read More »

What is Apriori Algorithm in Data Mining? Implementation with Examples

Introduction: The Apriori algorithm is an unsupervised learning algorithm used for association rule mining. It searches for frequent itemsets in a dataset and derives association rules between the items in them. We often see ‘frequently bought together’ and ‘you may also like’ in the recommendation sections of online shopping platforms – that’s the kind of result the Apriori algorithm can produce! The name… Read More »

Python Data Science Libraries

Introduction: Data science, as you may know, is a field that involves many steps that help data scientists derive useful information from data and make business decisions. Typically, the entire data science lifecycle consists of the following steps: data collection (surveys, crawlers, etc.); data preparation (cleaning, wrangling, processing, filtering, etc.); data mining… Read More »

Introduction to Classification Algorithms

There are three main types of machine learning algorithms – supervised, unsupervised, and reinforcement learning. Classification is a type of supervised learning in which incoming data is assigned a label based on past data. The algorithm is trained on labelled examples, each of which belongs to one of several possible outcomes (classes). For example, we can… Read More »

What are Statistics and Probability?

Introduction: Statistics and probability are central to data analysis, manipulation, and formatting. Measures like mean, mode, median, and standard deviation are used in almost all data science problems. In this article, we will discuss the most important and widely used methods of statistics and probability for data science. Examples and importance of data: Data is a… Read More »