Advanced Methods in Data Science and Big Data Analytics


This course builds on skills developed in the Data Science and Big Data Analytics course. The main focus areas are Hadoop (including Pig, Hive, and HBase), Natural Language Processing, Social Network Analysis, Simulation, Random Forests, Multinomial Logistic Regression, and Data Visualization. Taking an “Open” or technology-neutral approach, the course uses several open-source tools to address big data challenges. It prepares the learner for the Dell Technologies Proven Professional advanced analytics specialist-level certification exam (E20-065).

Upon successful completion of this course, participants should be able to:

  • Develop and execute MapReduce functionality
  • Gain familiarity with NoSQL databases and Hadoop Ecosystem tools for
    analyzing large-scale, unstructured data sets
  • Develop a working knowledge of Natural Language Processing, Social
    Network Analysis, and Data Visualization concepts
  • Use advanced quantitative methods and apply one of them in a Hadoop
    environment
  • Apply advanced techniques to real-world datasets in a final lab

Price: 2 250 EUR without VAT (2 723 EUR including VAT)

Earliest date: 04.03.2024


Course dates

Starting date: 04.03.2024

Type: Virtual

Course duration: 5 days

Language: en

Price without VAT: 2 250 EUR


Starting date: Upon request

Type: Virtual

Course duration: 5 days

Language: en

Price without VAT: 2 250 EUR





Target group

This course is intended for aspiring data scientists, data analysts who have completed the associate-level Data Science and Big Data Analytics course, and computer scientists who want to learn MapReduce and methods for analyzing unstructured data such as text.

Course structure

The content of this course is designed to support the course objectives.

Module 1: MapReduce and Hadoop

  • Lesson 1: The MapReduce Framework
  • Lesson 2: Apache Hadoop
  • Lesson 3: Hadoop Distributed File System
  • Lesson 4: YARN

Module 2: Hadoop Ecosystem and NoSQL

  • Lesson 1: Hadoop Ecosystem
  • Lesson 2: Pig
  • Lesson 3: Hive
  • Lesson 4: NoSQL – Not Only SQL
  • Lesson 5: HBase
  • Lesson 6: Spark

Module 3: Natural Language Processing

  • Lesson 1: Introduction to NLP
  • Lesson 2: Text Preprocessing
  • Lesson 3: TF-IDF
  • Lesson 4: Beyond Bag of Words
  • Lesson 5: Language Modeling
  • Lesson 6: POS Tagging and HMM
  • Lesson 7: Sentiment Analysis and Topic Modeling

Module 4: Social Network Analysis

  • Lesson 1: Introduction to SNA and Graph Theory
  • Lesson 2: Most Important Nodes
  • Lesson 3: Communities and Small World
  • Lesson 4: Network Problems and SNA Tools

Module 5: Data Science Theory and Methods

  • Lesson 1: Simulation
  • Lesson 2: Random Forests
  • Lesson 3: Multinomial Logistic Regression

Module 6: Data Visualization

  • Lesson 1: Perception and Visualization
  • Lesson 2: Visualization of Multivariate Data

In addition to lecture and demonstrations, the classroom options include hands-on lab exercises designed to give the participant practical experience. The on-demand course provides recordings of the lab exercises.


Prerequisites

  • Completion of the Data Science and Big Data Analytics course
  • Proficiency in at least one programming language such as Java or Python
