Study guides, Class notes & Summaries

Looking for the best study guides, study notes, and summaries? On this page you'll find 9 study documents.

All 9 results

Hands-on Exercise Ex5-3: Detecting Fake News with Apache Spark and Spark NLP
  • Exam (elaborations) • 13 pages • 2024
  • Assignments 1 – 4 (10pts each, 40pts in total): Do the exercises in Sections 1.4 – 1.7. Assignment 5 (30pts): Rewrite the code for detecting fake/real news in the Trump and Biden tweet datasets. Note: Do not combine those datasets. • Read the article [21]. • (10pts) Write the code for downloading the two files: ◦ Use the two links in the article. ◦ Use the links from the raw data by clicking the raw button on th...
  • $10.49
Hands-on Exercise Ex5-2: Topic modeling with Apache Spark and Spark NLP
  • Exam (elaborations) • 16 pages • 2024
  • Assignments 1 – 4 (10pts each): Do the exercises in Sections 3.6 – 3.9. Assignment 5 (20pts): Try different values of k and maxIter to see which combination best suits your data in Section 3.8. Show at least five combinations, show their results, and explain why it is best. Assignment 6 (40pts): (30pts) Rewrite the code for finding topics in the coronavirus tweets dataset. (10pts) Also, try different values of k an...
  • $10.49
Hands-on Exercise Ex5-1: Natural Language Processing (NLP) with Named Entity Recognition (NER)
  • Exam (elaborations) • 8 pages • 2024
  • Assignment 10 (10pts): Annotate (NER) a text using a PretrainedPipeline (recognize_entities_dl) in Spark NLP [12][13]. • Input text from Wikipedia: The University of Illinois Springfield (UIS) is a public university in Springfield, Illinois, United States. The university was established in 1969 as Sangamon State University by the Illinois General Assembly and became a part of the University of Ill...
  • $10.49
Learn Models using ML Pipeline in Spark.
  • Exam (elaborations) • 3 pages • 2024
  • 2.2.1.2 Specify parameters: The next step is setting up parameters for the ML algorithm, LogisticRegression. We give 10 for maxIter (maximum iterations) and 0.01 for regParam (regularization parameter). For details, see reference [7]. After running the above code in the Spark shell, you will see the parameters you specified, e.g. maxIter and regParam, as well as others you can specify or change, such as aggregationDepth. 2.2.1.3 Learn model: Now it’s time to learn model wi...
  • $10.49
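As a rough illustration of what the two parameters named in this excerpt control, the sketch below trains a tiny logistic regression in plain Python. `max_iter` caps the number of update steps and `reg_param` scales an L2 penalty on the weight. This is not Spark's `LogisticRegression` (which uses a different optimizer and API); the function name `train_logistic`, the learning rate, and the toy data are all assumptions made for the example.

```python
import math

def train_logistic(points, labels, max_iter=10, reg_param=0.01, lr=0.5):
    """Batch gradient descent for one-feature logistic regression with an L2 penalty.

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(max_iter):                 # max_iter: upper bound on update steps
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(points, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # predicted probability
            grad_w += (p - y) * x
            grad_b += (p - y)
        grad_w = grad_w / n + reg_param * w   # reg_param: strength of the L2 shrinkage
        grad_b = grad_b / n                   # the intercept is left unpenalized
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up toy data: the label is 1 when x is positive.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w_weak, _ = train_logistic(xs, ys, max_iter=10, reg_param=0.01)
w_strong, _ = train_logistic(xs, ys, max_iter=10, reg_param=1.0)
print("weight with regParam=0.01:", w_weak)
print("weight with regParam=1.0: ", w_strong)
```

With the same iteration budget, the larger penalty keeps the learned weight closer to zero, which is the general effect a regularization parameter is meant to have.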
Data Analytics using Spark SQL
  • Exam (elaborations) • 2 pages • 2024
  • Assignment 1 (20pts), related to Section 3: Write and run a Spark command (not a SQL query) that uses the filter function to show the dates on which the number of deaths was severe (more than 800), along with the number of confirmed cases, the number of deaths, and the country. The output should look like the one below.
    +--------+-----+------+-----------------------+
    | dateRep|cases|deaths|countriesAndTerritories|
    +--------+-----+------+-----------------------+
    Note: Write commands/queries for all ...
  • $10.49
Data Analytics with DW/OLAP using Hive
  • Exam (elaborations) • 6 pages • 2024
  • Create Hive Tables: Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data queries and analysis. This exercise will use Hive as a data warehouse/OLAP tool for analyzing data. 2.1.3 Create Hive Tables. 2.1.3.1 Check Schema: To check the schema of the tables, look at the first 5 rows. To see them, use the Linux ‘head’ command. You can see the schema (at least the field/column names) in the first line: driverId, name, ssn, location, certified, and wage-plan....
  • $10.49
NoSQL Database HBase
  • Exam (elaborations) • 2 pages • 2024
  • Assignments: 1. Write and run 11 HBase commands to insert a new row into the table. a. Table name: <your-namespace>:truck_event b. Rowkey: 20000 c. Column family name: events d. Columns: values i. driverId: <your-login or UIS NetID> ii. truckId: 999 iii. eventTime: 01:01.1 iv. eventType: <Pick one from Normal, Overspeed, and Lane Departure> v. longitude: -94.58 vi. latitude: 37.03 vii. eventKey (This is a RowKey) viii. CorrelationId: 1000 ix. ...
  • $10.49
Building a Hadoop Cluster with three VMs
  • Exam (elaborations) • 6 pages • 2024
  • Assignments: 1. (55pts in total) Check whether your Hadoop cluster is running correctly. Explain and take screenshots. a. (15pts) Before you do it, change or show the hostnames of your three nodes. i. Hostnames 1. Master: <your-NetID>-HM 2. Worker1: <your-NetID>-W1 3. Worker2: <your-NetID>-W2 4. For example, sslee777-HM, sslee777-W1, and sslee777-W2 ii. b. (15pts) By creating your user directory. See the example below (...
  • $10.49
Big_Data_Analytics_Ex2-1
  • Exam (elaborations) • 3 pages • 2024
  • Write Pig scripts to find truck drivers who exceeded the speed limit (‘Overspeed’). a. Dataset: Truck IoT dataset i. Dataset location (Linux filesystem): /home/data/CSC534BDA/datasets/Truck-IoT/ ii. Filenames: truck_event_text_ b. Write and run your Pig scripts i. (20pts) Find all truck drivers who exceeded the speed limit, ‘Overspeed’ ii. (10pts) Define the schema when you load the data (don’t use $0, $1, etc.) iii. (10pts) Show the driver’s events grouped, if the drivers exc...
  • $10.49