
Data Science with R

Data Science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data, much like data mining. We are Data Science SMEs, providing Data Science solutions and training that turn business information into predictions and better outcomes. This course aims to develop well-rounded Data Science professionals who can work with data from its inception using powerful tools such as R, Tableau, SQL and Hadoop, and who can perform data warehousing, predictive and statistical analysis, and visualization. It provides a deep understanding of data discovery, data analysis, data interpretation, data modeling, Business Intelligence, Machine Learning, predictive analysis and data visualization using a wide range of tools, technologies and methodologies.
Our Business Intelligence and Data Warehousing training helps students understand BI technology initiatives alongside Data Science strategy and vision. The course does not require any technical background and suits anyone who wants to move into the data science field.

We offer a comprehensive set of skills, from the initial concept down to the final outcome.

Core skill areas:
• Machine learning with R
• Data visualization with Tableau and R
• ETL with SQL and Hadoop

Course Content

Unit 1 : R Programming


The Introduction to R chapter covers the basics, from installing RStudio to writing simple syntax and code in it. The history of R, an introduction to the R workspace and working environment, and the various operators used in R are briefly discussed in this introductory part.
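For a flavour of what this looks like in practice, a minimal sketch of basic R syntax (the exact examples used in the course may differ):

```r
# Assignment uses the <- operator; = also works in most contexts.
x <- 10
y <- 3

# Arithmetic, relational and logical operators:
x + y           # addition: 13
x %% y          # modulo: 1
x > y           # relational: TRUE
x > 5 & y < 5   # logical AND: TRUE

# The workspace holds everything you create; ls() lists it.
ls()
```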
This chapter briefly discusses the types of data structure that R accepts as input, and the various operations that can be performed using these structures.
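A short sketch of the main structures and a few operations on them (illustrative values only):

```r
# Core data structures R accepts as input:
v  <- c(2, 4, 6)                        # atomic vector
m  <- matrix(1:6, nrow = 2)             # 2 x 3 matrix
l  <- list(id = 1, tags = c("a", "b"))  # list: mixed types allowed
df <- data.frame(name = c("Ann", "Raj"), score = c(91, 85))

# Common operations on them:
v * 10               # vectorised arithmetic
t(m)                 # matrix transpose
l$tags               # list element by name
df[df$score > 90, ]  # data frame row filtering
```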
Here you will learn how to load and read different types of data in the R working environment, and how to run different functions and analyses on each type of data.
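As a sketch (the file names here are hypothetical placeholders):

```r
# Reading a comma-separated file:
sales <- read.csv("sales.csv", stringsAsFactors = FALSE)

# Tab-delimited and other flat files via read.table:
logs <- read.table("events.tsv", sep = "\t", header = TRUE)

# Quick checks after loading:
str(sales)      # column types and a preview
head(sales)     # first six rows
summary(sales)  # per-column summaries
```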
This chapter briefly discusses the functions that are used most frequently and are essential for analysing data in R, illustrating their utility with practical examples.
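A few of the workhorse functions this chapter has in mind, shown on the mtcars dataset that ships with R (the course's own examples may differ):

```r
# The apply family condenses loops over rows, columns or groups:
apply(mtcars[, c("mpg", "hp")], 2, mean)  # column means
sapply(mtcars, class)                     # type of every column
tapply(mtcars$mpg, mtcars$cyl, mean)      # mean mpg per cylinder count

# Other day-to-day helpers:
table(mtcars$gear)                        # frequency counts
sort(mtcars$mpg, decreasing = TRUE)[1:3]  # three largest values
which.max(mtcars$hp)                      # row index of the maximum
```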
The working with text and dates chapter covers the various date functions and the packages that support them, such as lubridate, along with the text functions and their behaviour, illustrated with practical examples.
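A brief sketch of both halves of the chapter:

```r
library(lubridate)  # install.packages("lubridate") if missing

# Dates: parsing, extraction and arithmetic.
d <- ymd("2021-03-15")
month(d)                # 3
d + days(30)            # "2021-04-14"
wday(d, label = TRUE)   # day of the week as a labelled factor

# Text: a few of the base string functions.
s <- "Data Science with R"
toupper(s)                  # upper-case
nchar(s)                    # character count
gsub("R", "RStudio", s)     # substitution
strsplit(s, " ")[[1]]       # split into words
```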
The Descriptive Statistics chapter covers the main elements of descriptive statistics, i.e., working with the mean, median, variance, quantiles, etc. in R, with their practical implementation in RStudio.
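A minimal sketch with made-up scores:

```r
scores <- c(62, 71, 71, 75, 80, 88, 94)  # illustrative data

mean(scores)     # arithmetic mean
median(scores)   # middle value: 75
var(scores)      # sample variance
sd(scores)       # standard deviation
quantile(scores, probs = c(0.25, 0.5, 0.75))  # quartiles
summary(scores)  # min, quartiles, mean and max at a glance
```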
This chapter gives an overview of statistical tests and methods applied to the datasets available in RStudio. The tests covered in this section include the Z-test, the t-test, and working with normally distributed data.
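For example, using the built-in mtcars dataset (an assumption; the course may use other datasets):

```r
# One-sample t-test: does mean mpg differ from 20?
t.test(mtcars$mpg, mu = 20)

# Two-sample t-test: mpg for automatic vs manual transmission.
t.test(mpg ~ am, data = mtcars)

# A proportion test (z-based) via prop.test:
prop.test(x = 45, n = 100, p = 0.5)

# Eyeballing normality with a quantile-quantile plot:
qqnorm(mtcars$mpg); qqline(mtcars$mpg)
```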
In this chapter we cover the basic and some of the advanced methods R offers for cleaning data efficiently, along with the value of the data produced once these cleaning methods have been applied to the raw input. The methods cover both data replacement and data elimination in R.
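A small sketch of both styles of cleaning on a toy data frame:

```r
raw <- data.frame(id   = 1:5,
                  age  = c(23, NA, 31, 999, 28),  # NA plus a sentinel value
                  city = c("Pune", "pune", NA, "Delhi", "Delhi"))

# Data replacement: recode the sentinel, then impute with the median.
raw$age[raw$age == 999] <- NA
raw$age[is.na(raw$age)] <- median(raw$age, na.rm = TRUE)

# Standardise inconsistent text before analysis.
raw$city[raw$city == "pune"] <- "Pune"

# Data elimination: drop rows that still contain missing values.
clean <- na.omit(raw)
```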
Data visualization with R covers the practical implementation of the plots and charts created with the ggplot2 package, along with an overview of the esquisse package, which adds drag-and-drop charting to R.
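A representative ggplot2 call, plus the one-line esquisse launch:

```r
library(ggplot2)

# A scatter plot built from layers: data, aesthetics, geometry, labels.
ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 3) +
  labs(title  = "Fuel efficiency vs weight",
       x      = "Weight (1000 lbs)",
       y      = "Miles per gallon",
       colour = "Cylinders")

# esquisse wraps the same grammar in a drag-and-drop interface:
# install.packages("esquisse"); esquisse::esquisser(mtcars)
```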
This chapter gives an overview of the statistical models that can be applied to a given dataset in R, introducing classification and regression models and their practical implementation in RStudio.
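A sketch of one regression and one classification model on built-in data:

```r
# Regression: model mpg as a function of weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit)                                   # coefficients, R-squared
predict(fit, newdata = data.frame(wt = 3, hp = 110))

# Classification: logistic regression on a binary outcome.
logit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
head(predict(logit, type = "response"))        # fitted probabilities
```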
Supervised learning with R covers classification methods on data, with a detailed implementation and explanation of models such as random forests, logistic regression and rpart trees in RStudio.
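A minimal sketch of the two tree-based models on the iris dataset (assuming the rpart and randomForest packages are installed):

```r
library(rpart)          # decision trees
library(randomForest)   # ensemble of trees

# A decision tree classifying iris species.
tree <- rpart(Species ~ ., data = iris, method = "class")
predict(tree, head(iris), type = "class")

# A random forest on the same task, with variable importance.
set.seed(42)
rf <- randomForest(Species ~ ., data = iris, ntree = 200)
importance(rf)
```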
Unsupervised methods in R covers the application and utility of techniques such as k-means clustering and principal component analysis (PCA), with examples worked on a dataset.
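For example:

```r
# k-means clustering on the numeric columns of iris.
set.seed(42)
km <- kmeans(iris[, 1:4], centers = 3)
table(km$cluster, iris$Species)   # clusters vs the true species

# PCA for dimensionality reduction.
pc <- prcomp(iris[, 1:4], scale. = TRUE)
summary(pc)            # variance explained per component
head(pc$x[, 1:2])      # data projected onto the first two PCs
```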
The recommender system chapter works through a practical example that builds a ratings-based recommender, covering the underlying algorithm and how the model is built in R.
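The chapter does not name a package, but recommenderlab is a common choice in R; a toy sketch under that assumption:

```r
library(recommenderlab)

# A toy user-by-item ratings matrix (NA = not yet rated).
set.seed(1)
m <- matrix(sample(c(NA, 1:5), 50, replace = TRUE), nrow = 10,
            dimnames = list(paste0("user", 1:10), paste0("item", 1:5)))
ratings <- as(m, "realRatingMatrix")

# Train user-based collaborative filtering on the first eight users...
rec <- Recommender(ratings[1:8, ], method = "UBCF")

# ...and produce top-2 recommendations for the remaining two.
pred <- predict(rec, ratings[9:10, ], n = 2)
as(pred, "list")
```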
This chapter briefly describes applying a machine learning algorithm to real-life, dynamic data and examining the results it produces in R.
The chapter on deploying a script to production gives a practical demonstration of how code written in an R script can be deployed and made to work on whatever dataset is passed to it as input.
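A minimal sketch of what this can mean: a script whose input is a command-line argument, so the same code runs on any dataset (the scoring logic below is a placeholder):

```r
#!/usr/bin/env Rscript
# score.R -- entry point that works on whatever dataset it is given.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) < 2) stop("usage: Rscript score.R <input.csv> <output.csv>")

input <- read.csv(args[1])

# In a real deployment a saved model would be loaded here, e.g. with
# readRDS("model.rds"); the next line is placeholder logic only.
input$flag <- input[[2]] > mean(input[[2]], na.rm = TRUE)

write.csv(input, args[2], row.names = FALSE)
```

Run as, for example, `Rscript score.R new_data.csv scored.csv`.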

Unit 2 : Big Data


Under this topic we will learn about and explore Big Data and Hadoop, the importance of Hadoop, and the challenges you face with Big Data. The fundamental design principles and the technologies used in Hadoop will also be covered here, and you will learn about Hadoop's architecture, how it relates to an RDBMS, and its use cases.
In this section you will learn about the Hadoop cluster and its architecture, including the workflow within a cluster. Reading files from and writing files to HDFS will also be covered.
In this section we will get a basic idea of MapReduce and why it is needed, along with the terms mapper, reducer and shuffling. The use of MapReduce and its data flow will be covered under this section.
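Hadoop Streaming can run any executable as the mapper or reducer, including an R script; a word-count sketch under that assumption (the course itself may demonstrate MapReduce in Java):

```r
#!/usr/bin/env Rscript
# mapper.R -- emits "word<TAB>1" for every word read from stdin.
con <- file("stdin", open = "r")
while (length(line <- readLines(con, n = 1, warn = FALSE)) > 0) {
  words <- unlist(strsplit(tolower(line), "[^a-z']+"))
  words <- words[nchar(words) > 0]
  if (length(words)) writeLines(paste0(words, "\t1"))
}
close(con)
```

```r
#!/usr/bin/env Rscript
# reducer.R -- sums counts per word; Hadoop Streaming has already
# sorted the mapper output by key, so equal keys arrive together.
con <- file("stdin", open = "r")
current <- NULL; total <- 0L
while (length(line <- readLines(con, n = 1, warn = FALSE)) > 0) {
  parts <- strsplit(line, "\t", fixed = TRUE)[[1]]
  if (!identical(parts[1], current)) {
    if (!is.null(current)) cat(current, "\t", total, "\n", sep = "")
    current <- parts[1]; total <- 0L
  }
  total <- total + as.integer(parts[2])
}
if (!is.null(current)) cat(current, "\t", total, "\n", sep = "")
close(con)
```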
In this section you will learn terminology such as combiner, partitioner and counter, and get a basic idea of input and output formats. It also covers map-side and reduce-side joins with MapReduce, and how Hadoop is configured.
In this section you will get an idea of the challenges and drawbacks of Hadoop 1.0, then learn about the new features added in Hadoop 2.0, including its cluster architecture and federation. The concepts of YARN, the Hadoop ecosystem and the YARN MapReduce application flow will also be covered.
In this section you will learn the basic concepts of Pig, its features and use cases, and how to interact with it. It also covers basic data analysis and Pig Latin syntax, including how data is loaded, its data types, field definitions and how output is produced.
In this section you will learn about viewing schemas and filtering and sorting data, along with commonly used functions and processing complex data. It covers grouping data and techniques for combining data sets, and you will get an idea of how to join and split data sets in Pig.
In this section you will explore the fundamentals and architecture of Hive, and how to load data and run queries on it. It also covers the data types, operators and functions of HiveQL; managed and external tables; storage formats; importing data; altering and dropping tables; and querying, sorting and aggregating data, including MapReduce scripts.
Under this section you will learn about the CAP theorem, the concept and architecture of HBase, its client APIs and their features, and its data models and the operations defined over them.
In this section you will learn the basic concepts of Sqoop and how to connect to a relational database with it. You will learn to import data from and export data to MySQL, then to move MySQL data onward into Hive and HBase, and to use queries in Sqoop.
In this section you will learn the basic concepts of Flume, including its importance, architecture and configuration, followed by Oozie: its architecture, configuration, properties and job submission.
In this section you will learn the basics of Apache Spark, its importance and benefits. It also covers an overview of batch analytics in the Hadoop ecosystem and the options for real-time analytics, and gives an idea of how Spark benefits professionals. You will also learn about Spark's components and its execution architecture.
In this section you will learn about the features of Scala: basic data types, objects, classes, lists and maps. You will also learn to use functions as objects, anonymous functions and higher-order functions, along with pattern matching, traits and collections in Scala.
In this section you will learn about Spark applications and their deployment, and get an idea of distributed systems. You will get to know Spark as a scalable system and its execution context, and learn about parallelism and caching in Spark. The section also covers the basics of RDDs, with a deep dive into their dependencies and lineage.
In this section you will learn about transformations, actions, clusters and DataFrames in Spark, along with a basic introduction to SQL and using Spark SQL with CSV, JSON and database sources.
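From the R side, the sparklyr package offers one way to try these ideas (an assumption on our part; this unit otherwise works through Scala):

```r
library(sparklyr)
library(dplyr)

# Assumes a local Spark installation; spark_install() can fetch one.
sc <- spark_connect(master = "local")

# Copy an R data frame into Spark as a DataFrame.
cars_tbl <- copy_to(sc, mtcars, name = "cars", overwrite = TRUE)

# Transformations are lazy; collect() is the action that runs them.
cars_tbl %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg)) %>%
  collect()

# The same DataFrame is visible to Spark SQL.
DBI::dbGetQuery(sc, "SELECT cyl, COUNT(*) AS n FROM cars GROUP BY cyl")

spark_disconnect(sc)
```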
In this section you will learn about the features of Spark Streaming and its use cases, along with DStreams and the transformations on them. You will also revisit Hadoop and its importance, learn about distributed computation and functional programming, and look at the MapReduce framework and how its jobs run.

Unit 3 : SQL


• Introduction to T-SQL
• What is SQL Server
• What is SSMS
• What is Database
• Datatypes in SQL
• DDL Commands (Create, Alter, Drop)
• DML Commands (Select, Insert, Update, Delete, Truncate)
• Clauses (Where, Group By, Having, Order By) (a worked query follows this list)
• Aggregate Function
• Distinct, Null Values
• Operators (Arithmetic, Comparison, Logical)
 o Arithmetic (+, -, *, /, %)
 o Comparison (<, >, <=, >=, <>, !=, =)
 o Logical (AND, OR, NOT)
 o IN, NOT IN, IS NULL, BETWEEN

• Alias Name for column and table name
• Introduction to Views
• Inner Join
• Left Join
• Right Join
• Full Outer Join
• Cross Join
• Self Join
• Introduction to Stored Procedures (SP)
• Creating a Parameterized SP
• Executing an SP
• Introduction to Trigger
• Creating trigger
• Creating cursor
• Use of trigger and cursor
• Introduction to Functions
• Types of functions
o User-defined functions
 Inline table-valued function
 Multi-statement table-valued function
 Scalar function
o Built-in functions
 Date functions
 String functions
 Row_Number
 Rank, Dense_Rank
 Coalesce, IsNull
• Case when
• Union All
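Since the course works from R, a sketch of driving these T-SQL constructs from an R session using the DBI and odbc packages (the server, database and table names here are hypothetical):

```r
library(DBI)
library(odbc)

# Hypothetical connection details -- replace with your own server's.
con <- dbConnect(odbc::odbc(),
                 Driver   = "ODBC Driver 17 for SQL Server",
                 Server   = "localhost",
                 Database = "SalesDB",
                 Trusted_Connection = "Yes")

# One query exercising a join, a where clause, grouping, an aggregate,
# aliases, having and ordering -- much of the list above in one place.
orders_by_city <- dbGetQuery(con, "
  SELECT   c.City AS City, COUNT(o.OrderID) AS Orders
  FROM     Customers AS c
  INNER JOIN Orders AS o ON o.CustomerID = c.CustomerID
  WHERE    o.OrderDate >= '2020-01-01'
  GROUP BY c.City
  HAVING   COUNT(o.OrderID) > 10
  ORDER BY Orders DESC")

dbDisconnect(con)
```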

Unit 4 : Tableau


This section covers the basics of Tableau and gives insight into the importance and benefits of working with it, clarifying why Tableau is worth learning.
Here we learn how to connect to or import data from different data sources, and how data in different formats can be brought into Tableau.
This section covers cleaning data with the help of Tableau. Tableau is not generally used for heavy data cleaning, but minor cleaning can be done within it; you will learn the usage and working of the Data Interpreter tool.
Here you will learn about the actual workspace in which you will work on a dataset, and how its different features help surface different insights and make the data more interactive.
Here you will learn how to make data more concise and optimized, so it can be depicted more meaningfully. Mostly this means preparing the data by removing unwanted and noisy values.
In this section you will learn how to show data through the graph type appropriate to it, and the formatting methods that make graphs suitably descriptive. Different formatting options help visualize data in an understandable way.
In this section you will learn about the different filters in detail. Applying these filters to a graph as and when required makes it much more informative. You will also learn how to rename the data items shown on a row, and how to filter on a range of values rather than a single value.
This section covers creating additional, calculated fields, which you need when analysing something not given directly in the data source. You will learn in detail how to create a new field from existing fields, either directly or by applying conditions to them.
Under this section you will explore further filter tools that make the data easier to explore.
This topic covers framing a dashboard, which is essentially a combination of different graphs: you bring sheets depicting related information together on one canvas. You can also add filters so that a change made anywhere on the dashboard is reflected everywhere, and vice versa.
This topic covers sharing your Tableau project with a client, one of the vital steps in the course. Using the publishing function you can create a simple link that remote users can open to access the project you built.
In this section you will learn how to create a fully explanatory storyline: a sequence of graphs in which each sheet carries a caption or short guide, narrating a story that explains the data concisely and effectively.

Classroom batches for Australia, UAE and India: coming soon.

Fee structure for Australia, UAE and India: kindly send your queries to info@academyofdatascience.com.
