Does the solution meet the goal?

Last Updated on October 21, 2021 by Admin

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:

– /data/2018/Q1.csv
– /data/2018/Q2.csv
– /data/2018/Q3.csv
– /data/2018/Q4.csv
– /data/2019/Q1.csv

All files store data in the following format:

id,f1,f2,I
1,1,2,0
2,1,1,1
3,2,1,0
4,2,2,1

You run the following code:

DP-100 Designing and Implementing a Data Science Solution on Azure Part 06 Q01 092

You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:

DP-100 Designing and Implementing a Data Science Solution on Azure Part 06 Q01 093

Solution: Run the following code:

DP-100 Designing and Implementing a Data Science Solution on Azure Part 06 Q01 094

Does the solution meet the goal?

  • Yes
  • No

Explanation:

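Use two file paths, for example one for each year folder.
Use Dataset.Tabular.from_delimited_files instead of Dataset.File.from_files, as the data isn’t cleansed and must be parsed into a tabular format before it can be loaded into a data frame.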

Note:
A File Dataset references single or multiple files in your datastores or public URLs. If your data is already cleansed, and ready to use in training experiments, you can download or mount the files to your compute as a File Dataset object.
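As a rough illustration of the File Dataset case, the sketch below references every CSV file in the datastore and downloads the files unparsed. It assumes the azureml-core (SDK v1) package; the helper names, the glob pattern, and the workspace, datastore_name and target_dir arguments are hypothetical and do not come from the question.

```python
def all_csv_files(datastore, pattern="data/*/*.csv"):
    """Build a (datastore, glob) tuple that matches every quarterly file."""
    return (datastore, pattern)


def download_raw_files(workspace, datastore_name, target_dir):
    """Reference the files as a File Dataset and download them as-is."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from azureml.core import Dataset, Datastore

    datastore = Datastore.get(workspace, datastore_name)
    file_dataset = Dataset.File.from_files(path=all_csv_files(datastore))
    # download() copies the files to local storage and returns their paths;
    # the content is not parsed into rows and columns.
    return file_dataset.download(target_path=target_dir, overwrite=True)
```

A File Dataset hands back files, not rows, so it does not by itself produce a data frame.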

A Tabular Dataset represents data in a tabular format by parsing the provided file or list of files. This lets you materialize the data into a pandas or Spark DataFrame, so you can work with familiar data preparation and training libraries without having to leave your notebook. You can create a Tabular Dataset object from .csv, .tsv, .parquet, and .jsonl files, and from SQL query results.
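A minimal sketch of the Tabular Dataset approach described in the explanation, assuming the azureml-core (SDK v1) package and one path per year folder. The helper names and the workspace and datastore_name arguments are hypothetical, and the code shown in the question's images may differ.

```python
def year_folder_paths(datastore, years=("2018", "2019")):
    """Build one (datastore, glob) tuple per year folder."""
    return [(datastore, f"data/{year}/*.csv") for year in years]


def load_training_data(workspace, datastore_name):
    """Register a Tabular Dataset and return its rows as one pandas DataFrame."""
    # Imported lazily so the path helper above can be used without the SDK installed.
    from azureml.core import Dataset, Datastore

    datastore = Datastore.get(workspace, datastore_name)
    # from_delimited_files parses the CSV files into a tabular representation.
    dataset = Dataset.Tabular.from_delimited_files(path=year_folder_paths(datastore))
    dataset = dataset.register(workspace=workspace, name="training_data")
    # All matched files are combined into a single data frame.
    return dataset.to_pandas_dataframe()
```

Passing a list with two paths covers both the 2018 and 2019 folders, and to_pandas_dataframe() materializes the rows from every matched file in one data frame.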
