Required Exams
- DS700 – Descriptive and Inferential Statistics on Big Data
- DS701 – Advanced Analytical Techniques on Big Data
- DS702 – Machine Learning at Scale
The exams may be taken in any order. All three exams must be passed within 365 days of each other. Candidates who fail an exam must wait thirty calendar days, beginning the day after the failed attempt, before they may retake the same exam. Candidates must pay for each exam attempt.
Each passed exam is verifiable in your exam transcript and history.
Exam Format
Each exam is a single challenge scenario. You are given access to the scenario, the data sets, and the cluster, and you have eight (8) hours to complete the challenge. See below for more information on the cluster.
Required Skills
Common Skills (all exams)
- Extract relevant features from a large dataset that may contain bad records, partial records, errors, or other forms of “noise”
- Extract features from data stored in a wide range of possible formats, including JSON, XML, raw text logs, industry-specific encodings, and graph link data
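As an illustration of the first skill, here is a minimal Python 3 sketch of pulling features out of JSON-lines input while skipping bad, partial, or malformed records. The field names `user_id` and `duration` and the sample data are hypothetical, chosen only for this example; they are not taken from any exam scenario.

```python
import json

def extract_features(lines):
    """Return ([(user_id, duration), ...], skipped_count) from JSON-lines input.

    The field names "user_id" and "duration" are hypothetical.
    """
    features, skipped = [], 0
    for line in lines:
        try:
            record = json.loads(line)
        except ValueError:
            # Malformed or truncated JSON (JSONDecodeError subclasses ValueError).
            skipped += 1
            continue
        if not isinstance(record, dict) or "user_id" not in record or "duration" not in record:
            # Partial record: a required field is missing.
            skipped += 1
            continue
        try:
            duration = float(record["duration"])
        except (TypeError, ValueError):
            # Field present but not numeric, e.g. "n/a".
            skipped += 1
            continue
        features.append((record["user_id"], duration))
    return features, skipped

# Made-up sample input mixing good records with several kinds of noise.
sample = [
    '{"user_id": "a1", "duration": "12.5"}',
    '{"user_id": "b2"}',
    '{"user_id": "c3", "duration": 7',
    'not json at all',
    '{"user_id": "d4", "duration": "n/a"}',
    '{"user_id": "e5", "duration": 3}',
]
features, skipped = extract_features(sample)
```

Counting the skipped records, rather than silently dropping them, makes it easier to judge how noisy the input is before trusting any downstream statistic.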
DS700 - Descriptive and Inferential Statistics on Big Data
- Use statistical tests to determine confidence for a hypothesis
- Calculate common summary statistics, such as mean, variance, and counts
- Fit a distribution to a dataset and use that distribution to predict event likelihoods
- Perform complex statistical calculations on a large dataset
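One way to compute counts, means, and variances over data too large to hold in memory is a single-pass (Welford-style) accumulator whose partial results can be merged across partitions. The sketch below is a plain Python 3 illustration of that idea, not a description of how the exam expects the task to be solved; the class name `RunningStats` is made up for this example.

```python
class RunningStats:
    """Single-pass count, mean, and sample variance (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one value into the running statistics."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    def merge(self, other):
        """Combine with statistics computed on another partition."""
        if other.count == 0:
            return self
        total = self.count + other.count
        delta = other.mean - self.mean
        self._m2 += other._m2 + delta * delta * self.count * other.count / total
        self.mean += delta * other.count / total
        self.count = total
        return self

    @property
    def variance(self):
        """Sample variance (n - 1 denominator); 0.0 with fewer than two values."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

# Two "partitions" of a small made-up dataset, processed independently.
left, right = RunningStats(), RunningStats()
for x in [2, 4, 4, 4]:
    left.update(x)
for x in [5, 5, 7, 9]:
    right.update(x)
combined = left.merge(right)
```

Because `merge` only needs the count, mean, and squared-deviation sum of each side, the same pattern fits a map-and-reduce style computation where each worker summarizes its own slice of the data.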
DS701 - Advanced Analytical Techniques on Big Data
- Build a model that contains relevant features from a large dataset
- Define relevant data groupings, including number, size, and characteristics
- Assign data records from a large dataset into a defined set of data groupings
- Evaluate goodness of fit for a given set of data groupings and a dataset
- Apply advanced analytical techniques, such as network graph analysis or outlier detection
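To make the "assign records to groupings" and "evaluate goodness of fit" skills concrete, here is a small Python 3 sketch that assigns points to their nearest centroid and scores the grouping by within-cluster sum of squares. Treating groupings as centroids and using that particular score are assumptions made for illustration; the exam text does not prescribe a method, and the points and centroids are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]

def within_cluster_ss(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid.

    Lower values mean tighter groupings for a fixed number of groups.
    """
    return sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))

# Made-up 2-D points and two hypothetical group centers.
points = [(0, 0), (1, 0), (10, 10), (11, 10)]
centroids = [(0.5, 0), (10.5, 10)]
labels = assign(points, centroids)
score = within_cluster_ss(points, centroids, labels)
```

Comparing this score across different numbers of groupings is one common way to decide how many groups a dataset supports, though the score always falls as groups are added, so it has to be read alongside the group count.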
DS702 - Machine Learning at Scale
- Build a model that contains relevant features from a large dataset
- Predict labels for an unlabeled dataset using a labeled dataset for reference
- Select a classification algorithm that is appropriate for the given dataset
- Tune algorithm metaparameters to maximize algorithm performance
- Use validation techniques to determine how well a given algorithm performs on the given dataset
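As one example of a validation technique, the following Python 3 sketch implements k-fold cross-validation by hand and uses it to score a majority-class baseline. The helper names (`k_fold_splits`, `cross_val_accuracy`, `fit_majority`) and the labels are invented for this illustration; in practice a library such as scikit-learn, which the cluster includes, provides equivalent functionality.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def cross_val_accuracy(xs, ys, fit, k=5):
    """Mean accuracy over k folds; fit(train_x, train_y) returns a predict function."""
    scores = []
    for train, test in k_fold_splits(len(xs), k):
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        correct = sum(1 for i in test if predict(xs[i]) == ys[i])
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

def fit_majority(train_x, train_y):
    """Baseline classifier: always predict the most common training label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda x: majority

# Made-up labeled data: the features are ignored by the baseline.
xs = list(range(10))
ys = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
baseline_accuracy = cross_val_accuracy(xs, ys, fit_majority, k=5)
```

The folds here are contiguous and unshuffled to keep the example deterministic; real data is usually shuffled (or stratified by label) first. A baseline score like this gives a floor that any tuned classifier should beat on the same folds.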
Exam Delivery and Cluster Information
All CCP: Data Scientist exams are remote-proctored and available anywhere, anytime. See the FAQ for more information and system requirements.
Exams are hands-on, practical exams using data science tools on Cloudera technologies. Each candidate is given their own 7-node, high-performance CDH5 (currently 5.3.2) cluster pre-loaded with Spark, Impala, Crunch, Hive, Pig, Sqoop, Kafka, Flume, Kite, Hue, Oozie, DataFu, and many others (see the full list). In addition, the cluster comes with Python (2.6 and 3.4), Perl 5.10, Elephant Bird, Cascading 2.6, Brickhouse, Hive Swarm, Scala 2.11, Scalding, IDEA, Sublime, Eclipse, NetBeans, scikit-learn, Octave, NumPy, SciPy, Anaconda, R, plyr, dplyrimpaladb, SparkML, Vowpal Wabbit, clouderML, Oryx, impyla, CoreNLP, The Stanford Parser: A statistical parser, Stanford Log-linear Part-Of-Speech Tagger, Stanford Named Entity Recognizer (NER), Stanford Word Segmenter, OpenNLP, H2O, java-ml, RapidMiner, Caffe, Weka, NLTK, matplotlib, ggplot, d3py, SparkingPandas, randomforest, R: ggplot2, and Sparkling Water. The cluster is open, and candidates are allowed to install any tool they wish during the exam window.
Currently, the cluster is open to the internet and there are no restrictions on tools you can install or websites or resources you may use.
CCP:DS Solution Kit