Advanced Methods in Data Science and Big Data Analytics

This course builds on skills developed in the Data Science and Big Data Analytics course. 

Course provider
Pumpedu
Category
IT & Programming

Course overview

Duration
40 hours
Level
Beginner
Format
Online
Language
English
Price
from 51,360 CZK

About the course

These authorized courses are available only in English, so the training description has not been translated.

The training is delivered by the authorized distributor DNS a.s.

The main focus areas cover Hadoop (including Pig, Hive, and HBase), Natural Language Processing, Social Network Analysis, Simulation, Random Forests, Multinomial Logistic Regression, and Data Visualization.

Taking an "Open" or technology-neutral approach, this course utilizes several open-source tools to address big data challenges.

What you will learn

Upon successful completion of this course, participants should be able to:
  • Develop and execute MapReduce functionality
  • Gain familiarity with NoSQL databases and Hadoop Ecosystem tools for analyzing large-scale, unstructured data sets
  • Develop a working knowledge of Natural Language Processing, Social Network Analysis, and Data Visualization concepts
  • Use advanced quantitative methods and apply one of them in a Hadoop environment
  • Apply advanced techniques to real-world datasets in a final lab

Prerequisites

  • Completion of the Data Science and Big Data Analytics course
  • Proficiency in at least one programming language such as Java or Python

Course outline

  • Lesson 1: The MapReduce Framework
  • Lesson 2: Apache Hadoop
  • Lesson 3: Hadoop Distributed File System
  • Lesson 4: YARN

  • Lesson 1: Hadoop Ecosystem
  • Lesson 2: Pig
  • Lesson 3: Hive
  • Lesson 4: NoSQL - Not Only SQL
  • Lesson 5: HBase
  • Lesson 6: Spark

  • Lesson 1: Introduction to NLP
  • Lesson 2: Text Preprocessing
  • Lesson 3: TFIDF
  • Lesson 4: Beyond Bag of Words
  • Lesson 5: Language Modeling
  • Lesson 6: POS Tagging and HMM
  • Lesson 7: Sentiment Analysis and Topic Modeling

  • Lesson 1: Introduction to SNA and Graph Theory
  • Lesson 2: Most Important Nodes
  • Lesson 3: Communities and Small World
  • Lesson 4: Network Problems and SNA Tools

  • Lesson 1: Simulation
  • Lesson 2: Random Forests
  • Lesson 3: Multinomial Logistic Regression

  • Lesson 1: Perception and Visualization
  • Lesson 2: Visualization of Multivariate Data Module

Certification

This training prepares the learner for the Dell Technologies Proven Professional advanced analytics specialist-level certification exam (E20-065).

Who is this course for?

  • This course is intended for aspiring Data Scientists, data analysts who have completed the associate-level Data Science and Big Data Analytics course, and computer scientists who want to learn MapReduce and methods for analyzing unstructured data such as text.

Important information

Materials
Materials are provided in electronic form by Dell Technologies.
Course code
PU23220056
Course offered by
Pumpedu
www.pumpedu.cz