admin – My Space

Data Lifecycle

The data life cycle is a framework that outlines the stages that data goes through from its initial creation or capture to its eventual deletion or archival. Here are the typical steps in the data life cycle: 2. Data Ingestion: 3. Data Storage: 4. Data Processing: 5. Data Analysis: 6. Data Visualization and Reporting: 7.

Languages of Data Science

Leave a Comment / Uncategorized / admin

The Languages of Data Science: For anyone just getting started on their data science journey, the range of technical options can be overwhelming. There is a dizzying amount of choice when it comes to programming languages. Each has its own strengths and weaknesses and there is no one right answer to the question of which one you should


Practical Consideration in K Means Algorithm

1 Comment / Uncategorized / admin

Let’s look at some of the factors that can affect the final clusters you obtain from the K-Means algorithm. This will also give you an idea of the issues to keep in mind before you start building clusters to solve your business problem. Thus, the major practical considerations involved in K-Means clustering


Steps of the Algorithm

Leave a Comment / Uncategorized / admin

Let’s go through the K-Means algorithm using a very simple example. Let’s consider a set of 10 points on a plane and try to group these points into, say, 2 clusters. So let’s see how the K-Means algorithm achieves this goal. [Note: If you don’t know what is meant by Euclidean distance, you’re advised to


Centroid

Leave a Comment / Uncategorized / admin

The next concept that is crucial for understanding how clustering generally works is the idea of centroids. If you remember your high school geometry, centroids are essentially the center points of triangles. Similarly, in the case of clustering, centroids are the center points of the clusters that are being formed. Now before going to the

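As a small illustration of the idea in this excerpt, the centroid of a set of points can be computed as the coordinate-wise mean. The helper name centroid and the sample triangle are hypothetical, chosen only for this sketch.

```python
def centroid(points):
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


# Hypothetical example: the three vertices of a triangle.
triangle = [(0.0, 0.0), (6.0, 0.0), (0.0, 3.0)]
print(centroid(triangle))  # (2.0, 1.0)
```

The same function applies unchanged to the points assigned to a cluster, which is how a cluster's center is usually obtained in K-Means.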

K-Means clustering

Leave a Comment / Uncategorized / admin

Euclidean Distance: In the previous segments, you got an idea about how clustering works – it groups the objects on the basis of their similarity or closeness to each other. Now, the next important thing is to get into the nitty-gritty of how clustering algorithms generally work. You will learn about the 2 types of

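Since this excerpt leads with Euclidean distance as the measure of closeness, here is a minimal sketch of how it can be computed for two points of the same dimension. The function name euclidean_distance and the sample coordinates are assumptions for illustration.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean_distance((1, 2), (4, 6)))  # 5.0
```

The smaller this value, the closer (more similar) two objects are considered to be.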

Cost Function

1 Comment / Uncategorized / admin

Cost Function: We can measure the accuracy of our hypothesis function by using a cost function. This takes an average difference (actually a fancier version of an average) of all the results of the hypothesis with inputs from x’s and the actual output y’s. J(\theta_0, \theta_1) = \dfrac{1}{2m} \displaystyle \sum_{i=1}^m \left( \hat{y}_{i} -

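The formula in this excerpt is cut off, so the sketch below rests on two assumptions that the excerpt itself does not confirm: that the sum continues as the usual squared difference between each prediction and its actual value, and that the hypothesis is the straight line theta_0 + theta_1 * x. The function name squared_error_cost and the sample data are likewise hypothetical.

```python
def squared_error_cost(theta_0, theta_1, xs, ys):
    """Assumed cost: 1/(2m) times the sum of squared prediction errors."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = theta_0 + theta_1 * x  # assumed linear hypothesis
        total += (y_hat - y) ** 2
    return total / (2 * m)


# Hypothetical data lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]
print(squared_error_cost(0.0, 1.0, xs, ys))  # 0.0 for a perfect fit
print(squared_error_cost(0.0, 0.5, xs, ys))  # larger for a worse fit
```

Under these assumptions, a lower cost means the hypothesis tracks the actual outputs more closely.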

Model and cost function

Leave a Comment / machine learning / admin

Model Representation: To establish notation for future use, we’ll use x^{(i)} to denote the “input” variables (living area in this example), also called input features, and y^{(i)} to denote the “output” or target variable that we are trying to predict (price). A pair (x^{(i)}, y^{(i)}) is called a training example, and the dataset


Unsupervised Learning

Leave a Comment / Uncategorized / admin

Unsupervised Learning: Unsupervised learning allows us to approach problems with little or no idea of what our results should look like. We can derive structure from data where we don’t necessarily know the effect of the variables. We can derive this structure by clustering the data based on relationships among the variables in the data. With
