SigmaWay - Recent Blog Posts - SigmaWay Blog - Page 44

The Art of Predictive Modelling

Wednesday, 08 March 2017

Your perspective on data depends on the type of task you want to accomplish. These tasks can be broadly classified as:

Analytics: Helps you explore what happened and why.

Monitoring: Looks at things as they occur to find abnormalities.

Prediction: Predicts what might happen in the future.

Some of the most popular algorithms that can be applied to predict future trends are:

The Ensemble Model: It combines the output of multiple models to arrive at a decision. However, one has to understand how to pick the right models and which problem one wants to solve.

Unsupervised Clustering Algorithms: These algorithms help to group similar people and objects together.

Regression Algorithms: These are used to predict future values for a product or service.
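To make the ensemble idea concrete, here is a minimal sketch of majority voting in Python. The three "models" are hypothetical hand-written rules on a customer's monthly spend, invented only for illustration; a real ensemble would combine trained models.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label that most models predicted."""
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical models (assumptions for this sketch), each a simple
# threshold rule on a customer's monthly spend.
def model_a(spend):
    return "churn" if spend < 20 else "stay"


def model_b(spend):
    return "churn" if spend < 35 else "stay"


def model_c(spend):
    return "churn" if spend < 50 else "stay"


def ensemble_predict(spend):
    """Ask every model for a prediction and keep the majority answer."""
    votes = [model(spend) for model in (model_a, model_b, model_c)]
    return majority_vote(votes)


print(ensemble_predict(30))  # two of the three rules vote "churn"
```

With an odd number of models and two labels there is never a tie, which is one reason small ensembles often use three or five voters.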

There is no ideal formula for finding the most suitable method for predictive analytics. A strong level of business expertise is required to master the ‘art’ of predictive modelling. Read more at: http://www.analyticbridge.com/profiles/blogs/the-ultimate-guide-for-choosing-algorithms-for-predictive

2985 Hits

0 Comments

Tags:

Data Value Chain for GeoSpatial Data

Wednesday, 08 March 2017

Ashish Pahwa

Analytics

The value of data has changed over time. Companies have realized that collecting, analyzing, sharing and selling data, and extracting actionable insights from it, are critical to the development of their organization. Geospatial data is captured and analyzed by engineers and product managers to develop creative solutions and thus increase productivity. People can view the flow of geospatial data from the instant it is collected and throughout its lifecycle using a framework known as the 'Data Value Chain'. Where data intersects with analytics, information can be turned into decisions. A technological ecosystem built around a geospatial system provides new ways to work, reduce costs, accelerate schedules and supply high-value deliverables along the value chain. Read more at: http://dataconomy.com/2017/02/power-of-data-value-chain/

4162 Hits

0 Comments

Tags:

Geospatial data data value chain analytics

Data Science Challenges in Production Environment

Wednesday, 08 March 2017

Ashish Pahwa

Analytics Technology

Very little time is spent thinking about how to deploy a data science model into production. As a result, many companies fail to earn the value that should come from their efforts and investments. In a production environment, data arrives continuously, results are computed and models are frequently retrained. The companies facing these challenges fall into four categories:

Small Data Teams: They mostly use small data and often don’t retrain models; the business team is involved in the development project.

Packagers: They often build their framework from scratch and practice informal A/B testing; they are generally not involved with the business team.

Industrialization Maniacs: These teams are IT-led and have automated processes for deployment and maintenance; business teams are not involved in monitoring and development.

The Big Data Lab: They use more complex technologies; business teams are involved before and after deployment of the data product.

Companies should understand that working in production is different from working with SQL databases in development. Moreover, real-time learning and multi-language environments will make the process more complex. Strong collaboration between business and IT teams will also increase efficiency. Read more at: http://dataconomy.com/2017/02/value-from-data-science-production/

4557 Hits

0 Comments

Tags:

data science production environment data science model

Rise of Data Science Platforms

Thursday, 23 February 2017

Ashish Pahwa

Technology

"Data science platform" has become a buzzword of the decade. So, what is it? The sole purpose of a data science platform is to encapsulate all of the data science work by incorporating the tools required to collect, analyze and visualize data, build and deploy models, and generate reports. This toolkit makes it convenient to maintain, reproduce and scale up a project and to produce results dynamically. Adoption of data science platforms is expected to nearly double by 2018 as more companies realize their potential benefits. Many data-driven businesses struggle to use data science tools effectively and lack an integrated approach to their data science technology stack, which makes it hard to find value in the data. On the other hand, companies that have already established data science platforms are excelling in the field.

2995 Hits

0 Comments

Tags:

Data Science Platforms data science tools

Deploying Machine Learning On Real Time Systems

Monday, 20 February 2017

Ashish Pahwa

Technology

The three critical steps involved in deploying a machine learning algorithm and exposing it to the real world are:

Define a goal based on a metric: Decide whether you need human-level intelligence or just an acceptable level, as this decision will affect the time and engineering cost of your system. Also define a metric to measure the performance of your model.

Build the system: Build a minimum viable system without worrying much about accuracy. Then build an incremental strategy to improve your system by solving the problems you face in each iteration.

Refine the system with more data: Initial metric values are not indicators of real-life performance. Your data and users might change, so monitor the system's performance regularly. Update it with new data and fine-tune the model accordingly.
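The monitoring step can be sketched as a rolling accuracy tracker. This is only an illustration under assumed values: the class name, the window size and the 0.8 threshold are made up here, and a real system would pick a metric that matches its own goal.

```python
from collections import deque


class RollingAccuracy:
    """Track accuracy over the most recent predictions (illustrative sketch)."""

    def __init__(self, window=100, threshold=0.8):
        # Only the last `window` outcomes are kept; older ones fall out.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether the latest prediction matched the real outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Return accuracy over the window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Flag the model once recent accuracy drops below the threshold."""
        current = self.accuracy()
        return current is not None and current < self.threshold


monitor = RollingAccuracy(window=4, threshold=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.record(predicted, actual)
print(monitor.accuracy(), monitor.needs_retraining())
```

Using a fixed-size window means the check reflects how the model is doing now rather than how it did at launch.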

3176 Hits

0 Comments

Tags:

real time system machine learning algorithm machine-learning

Enhancing Artificial Intelligence using Ensemble Training

Monday, 20 February 2017

Ashish Pahwa

Technology

Sometimes even machine learning algorithms behave so naively that an image recognition model can be fooled by an adversarial instance, i.e. an input in which a few pixels have been changed, either by taking the derivative of the model output or by exploiting genetic algorithms. Adversarial instances lie in low-probability regions, in contrast with the limited set of instances from high-probability regions on which the model was trained. A possible approach to this problem is ensemble training: letting multiple models back each other up. As we develop more artificially intelligent systems, such problems will become common.

2961 Hits

0 Comments

Tags:

machine-learning Artificial Intelligence ensemble training

Effective Quality Management using Hypothesis Test

Saturday, 18 February 2017

Ashish Pahwa

Analytics

A business hypothesis is a foundational concept, and understanding it well helps you achieve business goals. For instance, hypothesis testing provides a mathematical way to answer questions such as whether you should spend on advertising or whether increasing the price of a product will affect your customers. Data collection is one part of the game, but correct data processing and interpretation form the final stage of your decision-making process. Hypothesis testing is used to infer whether there is enough evidence in the data to support a claim. There are various test methods:

Parametric Tests - z-test, t-test, F-test.

Non-Parametric Tests - Wilcoxon rank-sum test, Kruskal-Wallis test and permutation test.
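As an illustration of one of the non-parametric methods, here is a minimal sketch of a two-sided permutation test on the difference in means, written in plain Python. The sales figures are made up for the example, and the function name and defaults are assumptions of this sketch.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so shuffle the pooled data and split it again.
        rng.shuffle(pooled)
        shuffled_a = pooled[:len(group_a)]
        shuffled_b = pooled[len(group_a):]
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            extreme += 1
    return extreme / n_permutations


# Made-up daily sales before and after an advertising campaign.
before = [20, 22, 19, 24, 21, 23]
after = [30, 29, 33, 31, 28, 32]
p_value = permutation_test(before, after)
print(p_value)
```

A small p-value suggests that a difference this large would rarely appear if advertising had no effect; what counts as "small" (0.05 is a common convention) is a business decision, not something the test decides for you.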

3538 Hits

0 Comments

Tags:

quality management hypothesis test decision making data processing business hypothesis

Hadoop Architecture for Big Data Analytics

Saturday, 18 February 2017

Ashish Pahwa

Technology

The emergence of massive unstructured data sources like Facebook and Twitter has created a need for distributed processing systems for Big Data Analytics. Hadoop (a Java-based programming framework) has become the first choice of developers and industry experts, mainly because it is highly scalable, flexible and cheap. An application is broken down into many small parts which run on thousands of nodes to achieve fast computing speed and reduce overall operation time. The Hadoop architecture continues to operate even if a node fails. Its design allows you to process large volumes of data and extract computationally difficult features of users and customers.
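The idea of breaking an application into small parts can be sketched with the map, shuffle and reduce phases of a word count. Note that this is plain Python running on one machine, not Hadoop code; the function names and sample text are assumptions made for illustration, and on a real cluster each chunk would be handled by a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_chunks):
    """Group the emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Each chunk stands in for a block of data stored on a separate node.
chunks = ["big data needs big clusters", "data beats opinions"]
mapped = [map_phase(chunk) for chunk in chunks]
counts = reduce_phase(shuffle_phase(mapped))
print(counts["big"], counts["data"])
```

Because each map call only looks at its own chunk, the work can be spread across many machines and a failed chunk can simply be run again elsewhere.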

3622 Hits

0 Comments

Tags:

Big Data Hadoop Big Data Analytics unstructured data

Good Statistical Practice

Wednesday, 08 February 2017

Vasudev Singh

Analytics

You can’t be a good data scientist unless you have a good hold on statistics and know your way around data. Here are some simple tips for being an effective data scientist:
Statistical Methods Should Enable Data to Answer Scientific Questions - Inexperienced data scientists tend to take the link between data and scientific issues for granted, and hence often jump directly to a technique based on the data structure rather than the scientific goal.
Signals Always Come with Noise - Before working with data, analyse it and extract the part that is actually usable.
Data Quality Matters - Many novice data scientists ignore this fact and tend to use any kind of data available to them. It is always good practice to set norms for data quality.
Check Your Assumptions - The assumptions you make affect your output as much as your data does, so take special care with every assumption, as it will affect your whole model as well as your results.
These are some of the things to keep in mind when working with data. To know more, you can read the full article by Vincent Granville at http://www.datasciencecentral.com/profiles/blogs/ten-simple-rules-for-effective-statistical-practice

3129 Hits

0 Comments

Tags:

data science Analytics Tips and Trick analytics Statistical Practices

Scaling Data Models in Production Environment

Tuesday, 31 January 2017

Ashish Pahwa

Technology

Often the outputs of data models developed by data scientists end up in a report that summarizes the state of the business and is used by stakeholders to make decisions. But what is really needed is a system that can predict future outcomes in real time. This can be done by integrating the model into a production environment. However, that requires advanced engineering skills, and data scientists cannot do it alone. The deployment process broadly follows 7 steps:

1. Refactor the model code

2. Walk through the code and determine how it slots into the engineering cycle

3. Re-write it in a production stack language or PMML

4. Implement it in the tech stack

5. Test performance

6. Tweak the model based on test results

7. Slowly roll out the model
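The final step, a slow roll-out, is often done by sending only a fixed share of users to the new model. The sketch below shows one possible way to do that with a stable hash; the function names and the percentage scheme are assumptions for illustration, not a method described in the original post.

```python
import hashlib


def bucket(user_id):
    """Map a user id to a stable bucket between 0 and 99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def use_new_model(user_id, rollout_percent):
    """Return True if this user should be served by the new model."""
    return bucket(user_id) < rollout_percent


# Raising rollout_percent step by step (for example 5, 25, 100) widens the
# audience without moving users who were already on the new model.
for percent in (5, 25, 100):
    print(percent, use_new_model("customer-42", percent))
```

Hashing the user id, rather than choosing at random on every request, keeps each user on the same model between visits, which makes the test results from step 5 easier to compare.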

Today many companies are adopting tools to make this process faster to reap the benefit of data driven decision making.

2990 Hits

0 Comments

Tags:

Data Scientist data driven predict future data driven decision

Recommenders : The Future of E-commerce

Friday, 27 January 2017

Vasudev Singh

Analytics

Recommender systems have become the backbone of the ecommerce sector. They have helped companies like Amazon and Netflix to increase their revenue by as much as 10% to 25%.
Hence, the need of the hour is to optimize their performance.
So, what are recommenders? Recommenders are applications which personalize your customer’s shopping experience by recommending the next best options in light of their recent buying or browsing activity. Recent developments in analytics and machine learning have led to many state-of-the-art recommender systems.
Types of Recommenders: There are broadly five types of recommender systems, which are as follows:
1. Most Popular Item
2. Association and Market Basket Models
3. Content Filtering
4. Collaborative Filtering
5. Hybrid Models
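The first two types are simple enough to sketch in a few lines of Python. The orders below are made up, and the function names are assumptions for this example; the second function is a bare co-occurrence count, which is only the starting point of a real market basket model.

```python
from collections import Counter
from itertools import combinations

# Made-up order history: each set is one customer's basket.
orders = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "bag"},
    {"phone", "case"},
    {"laptop", "bag"},
]


def most_popular(orders, k=1):
    """Type 1: recommend the items that appear in the most orders."""
    counts = Counter(item for order in orders for item in order)
    return [item for item, _ in counts.most_common(k)]


def bought_together(orders, item, k=2):
    """Type 2 (simplified): items most often seen in the same basket."""
    related = Counter()
    for order in orders:
        for first, second in combinations(sorted(order), 2):
            if first == item:
                related[second] += 1
            elif second == item:
                related[first] += 1
    return [other for other, _ in related.most_common(k)]


print(most_popular(orders))
print(bought_together(orders, "laptop"))
```

Content filtering and collaborative filtering go further by comparing item attributes or user behaviour, and hybrid models combine several of these signals.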

In the coming years, recommender systems will be used by almost every organisation, whether big or small, and will become an inseparable part of the ecommerce world.

To know more read the article by William Vorhies at: http://www.datasciencecentral.com/profiles/blogs/understanding-and-selecting-recommenders-1

3320 Hits

0 Comments

Tags:

data science Analytics Tips and Tricks ecommerce Recommender Systems

2016: The year of Deep Learning

Friday, 27 January 2017

Vasudev Singh

Analytics

2016 has been the year of deep learning. Some big breakthroughs were achieved in 2016 by Google and DeepMind. Some of the most significant achievements are as follows:

AlphaGo triumphs in Go showdown: To everyone’s surprise, AlphaGo, Google’s AI for the game of Go, was able to beat Go champion Lee Sedol.

Bots kicking our butts in StarCraft: DeepMind AI bots were able to outperform some of the top-rated StarCraft II players.

DIY deep learning for Tic Tac Toe: AlphaToe, an AI bot, was able to outperform most of the people that played against it.

Google’s Multilingual Neural Machine Translation: Google was able to build a model which is capable of translating text between languages, reaching a new milestone in linguistics and NLP.

In a nutshell, 2016 was the year of deep learning, and many milestones once thought unachievable were reached during the year.

To know more, you can read the full article by Precy Kwan at http://www.datasciencecentral.com/profiles/blogs/year-in-review-deep-learning-2016

3570 Hits

0 Comments

Tags:

data science machine-learning Deep Learning AlphaGo DeepMind

A Guide to Choosing Machine Learning Algorithms

Friday, 27 January 2017

Vasudev Singh

Analytics

Machine learning, which learns from the data provided to its algorithms, is the backbone of today’s insights on customers, products, costs and revenues. Hence algorithms are the next most important thing in data science after data.
So the question is: which algorithm to use? Some of the most used algorithms and their use cases are as follows:

1) Decision Trees - Their output is easy to understand, and they can be used for investment decisions, customer churn, bank loan defaulters, etc.

2) Logistic Regression - It’s a powerful way of modeling a binomial outcome with one or more explanatory variables and can be used for predicting customer churn, credit scoring and fraud detection, measuring the effectiveness of marketing campaigns, etc.

3) Support Vector Machines - It’s a supervised machine learning technique that is widely used in pattern recognition and classification problems and can be used for detecting persons with common diseases such as diabetes, hand-written character recognition, text categorization, etc.

4) Random Forest - It’s an ensemble of decision trees that can solve both regression and classification problems with large data sets. It is used in applications such as predicting high-risk patients, predicting parts failures in manufacturing, predicting loan defaulters, etc.
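To show what "modeling a binomial outcome with explanatory variables" means in practice, here is a minimal sketch of how a fitted logistic regression turns inputs into a probability. The weights, bias and feature names are invented for illustration and are not fitted to any real data; in practice they would be learned from historical records.

```python
import math


def churn_probability(features, weights, bias):
    """Logistic regression scoring: sigmoid of a weighted sum of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: each support call raises the churn risk,
# each year as a customer lowers it.
weights = [0.8, -0.5]
bias = -1.0

loyal_customer = [1, 6]    # 1 support call, 6 years of tenure
unhappy_customer = [5, 1]  # 5 support calls, 1 year of tenure

print(churn_probability(loyal_customer, weights, bias))
print(churn_probability(unhappy_customer, weights, bias))
```

The output is always between 0 and 1, so a business rule such as "contact every customer above 0.5" can be applied directly to the score.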

Hence, based on your needs and the size of your dataset, you can choose the algorithm that is best for your application or problem.
You can read the full article by Sandeep Raut at http://www.datasciencecentral.com/profiles/blogs/want-to-know-how-to-choose-machine-learning-algorithm

3639 Hits

0 Comments

Tags:

machine-learning algorithms Analytics Tips and Trick data science

Winning Data Strategy using Industrialized Machine Learning

Friday, 27 January 2017

Ashish Pahwa

Analytics Market Intelligence

The first building block of a winning business strategy is to create a map based on the business value of each question and an estimate of how much time it would take to get high-quality answers to that question. The idea is to break the business questions into groups that correspond to real-time data systems. This allows you to focus on one specific system at a time to build a strong strategy, and to optimize the sequence in which each sub-question needs to be answered depending on its current business value. A pattern of actions for a data strategy begins with a hypothesis and the collection of relevant data, followed by building models to explain the data and evaluating their credibility for future predictions. The entire process is carried out on an enterprise-scale digital infrastructure using Industrialized Machine Learning (IML). This approach can have a huge impact on the natural resources and healthcare industries as well.

4295 Hits

0 Comments

Tags:

real time data systems data strategy Industrialized Machine Learning business strategy business value

A Neural Network Approach To Raise Your E-Book Business

Friday, 27 January 2017

Ashish Pahwa

Technology

E-book business communities generate a lot of revenue every day, but it is sometimes difficult for authors to earn a decent amount because of a lack of preparation and research. No matter how unique and interesting your content is, if it doesn't appear on the first or second page of search results, it's highly unlikely that a visitor will ever read it. The story doesn't end here: one must cleverly select a title and cover that attract the reader, as they change the way we think. A neural network approach that uses Doc2Vec to find the best words for a title can be adopted to increase revenue. It involves training a thin two-layer neural network, which operates in unsupervised mode and forms clusters of the most similar words (using the cosine similarity metric) based on context.

Read more about the technical implications here: http://www.datasciencecentral.com/profiles/blogs/use-neural-networks-to-find-the-best-words-to-title-your-ebook
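The cosine similarity metric mentioned above can be sketched in plain Python. The three-dimensional word vectors below are made up for the example; real Doc2Vec or Word2Vec vectors are learned from text and usually have a hundred or more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero-length vectors
    return dot / (norm_u * norm_v)


# Hypothetical word vectors, invented for illustration only.
word_vectors = {
    "guide": [0.9, 0.1, 0.3],
    "handbook": [0.8, 0.2, 0.4],
    "dragon": [0.1, 0.9, 0.2],
}


def most_similar(word, vectors):
    """Return the other word whose vector points in the closest direction."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))


print(most_similar("guide", word_vectors))
```

Because cosine similarity compares direction rather than length, two words that appear in similar contexts score close to 1 even if one of them is far more frequent.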

2954 Hits

0 Comments

Tags:

Neural Network unsupervised learning Doc2Vec Ebook business

Automatic Debt Management System

Friday, 20 January 2017

Ashish Pahwa

Analytics

Big Data Analytics and Business Intelligence are changing the way businesses interact with customers. Modern big data solutions have enabled automated decision making in debt management systems for client handling processes. Correct implementation of these tools provides a more personalized experience to each customer and avoids infringements. Debt management automation has proven to be a successful way to balance meticulous efficiency with customer satisfaction. Such a CRM automates many processes, so a small team needs only days to complete the debt collection process. Analytics has not just accelerated debt collection, but also enhanced customer relations.

3509 Hits

0 Comments

Tags:

business intelligence Debt Management Automatic Debt Management System client handling processes debt collection process

Essence of Qualitative Research

Friday, 20 January 2017

Ashish Pahwa

Analytics Market Intelligence

Global markets are becoming more complex each day, and it has therefore become essential for business intelligence teams to apply advanced methods of data interpretation. Many of these teams believe that only decisions based on quantitative data can be justified. However, quantitative research can go wrong in several ways, and the truth often comes out only when you meet people, talk to them and involve them in creative exercises.

4331 Hits

0 Comments

Tags:

Quantitative Research Qualitative Research business intelligence

Importance of Data Preparation

Thursday, 19 January 2017

Vasudev Singh

Analytics

Data is the backbone of analytics and machine learning, and hence one of the most important tasks in analytics is to get the right kind of data in the required format. The importance of data can be understood from the fact that around 60 to 80 percent of an analyst's time is spent preparing the data.
What exactly is data preparation? In a nutshell, it is the process of collecting, cleaning, processing and consolidating data for use in analysis. It enriches the data, transforms it and improves the accuracy of the outcome.
How is it done? It is mostly done through analytics tools or traditional extract, transform and load (ETL) tools. ETL tools include self-service data preparation tools, data cleansing and manipulation tools, etc.
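The cleaning and consolidating steps can be sketched in a few lines of plain Python. The records and field names below are made up for illustration, and dropping incomplete rows is just one possible policy; a real pipeline would usually rely on a dedicated ETL or data preparation tool.

```python
# Made-up raw records with typical problems: stray spaces, inconsistent
# capitalisation, amounts stored as text, and a missing value.
raw_rows = [
    {"customer": " Alice ", "amount": "120.50"},
    {"customer": "bob", "amount": ""},
    {"customer": "alice", "amount": "30"},
]


def clean(rows):
    """Normalise names, convert amounts to numbers, drop incomplete rows."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        amount = row["amount"].strip()
        if not name or not amount:
            continue  # assumption for this sketch: skip rows we cannot use
        cleaned.append({"customer": name, "amount": float(amount)})
    return cleaned


def consolidate(rows):
    """Combine the cleaned rows into one total per customer."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


print(consolidate(clean(raw_rows)))
```

Without the cleaning step, " Alice " and "alice" would be counted as two different customers, which is exactly the kind of error that makes preparation so time-consuming.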
Since data is the foundation of analytics, the right data helps in analysing a situation better and helps organizations react positively to market shifts.
To know more, read the full article by Ashish Sukhadeve (business analytics professional) at: http://www.datasciencecentral.com/profiles/blogs/why-data-preparation-should-not-be-overlooked

3349 Hits

0 Comments

Tags:

Data Science Analytics Tips and Trick data Data Preparation

Big Data Integration for Advanced Analytics

Thursday, 19 January 2017

Ashish Pahwa

Analytics

The modern needs of Big Data consumption require data integration before the data actually hits the business intelligence tools. This includes leveraging complex and unstructured data and enabling raw data to flow securely through the business. Today, even the smallest companies produce huge amounts of data across systems which need to communicate with each other, and they therefore require a platform to pipe all these data sources into Data Lakes.

3861 Hits

0 Comments

Tags:

Big data consumption Data Integration business intelligence Advanced Analytics Big Data

Building Consumer Intelligence System

Thursday, 19 January 2017

Ashish Pahwa

Analytics Market Intelligence

It is evident that a great customer experience is one of the signs of a healthy business model. Machine Learning and Data Analytics are playing a fundamental role in building consumer intelligence systems. It is important to capture data, and there is no single magic source from which to collect it. Telecoms are making billions by selling data. You need to ensure that the data is relevant to your business. Once you have the right data, you are ready to model, design, engineer and deploy your 360-degree customer view platform and achieve an enhanced customer experience for your organization.

4759 Hits

0 Comments

Tags:

business model machine-learning customer experience consumer intelligence system Data analytics
