Exam (elaborations)

COS4861 Assignment 2 2025 (COMPLETE ANSWERS)

Pages
19
Grade
A+
Uploaded on
10-07-2025
Written in
2024/2025

Natural Language Processing - COS4861 Assignment 2 2025: worked solutions with explanations.

Assignment 2: Natural Language Processing (NLP)
Total Marks: 100
Primary Lecturer: Dr M. Sibiya

Question 1: Theory of Automata (40 marks)
Topic: Deterministic Finite State Automata (DFSA) and Non-Deterministic Finite State Automata (NDFSA)

1.1 Define a Deterministic Finite State Automaton (DFSA). Explain its key components and how it operates with an example. (10 marks)
1.2 Define a Non-Deterministic Finite State Automaton (NDFSA). Explain its key components and how it differs from a DFSA with an example. (10 marks)
1.3 Prove that for every NDFSA, there exists an equivalent DFSA that recognizes the same language. Provide a step-by-step example where you convert a given NDFSA into a DFSA. (15 marks)
1.4 Discuss the significance of DFSA and NDFSA in the context of Natural Language Processing (NLP). Provide examples of how these automata models might be used in NLP tasks. (5 marks)

Question 2: Practical Project on NLP Data Preprocessing (60 marks)
Topic: Data Preprocessing Techniques in NLP

2.1 Create a block diagram to represent the workflow of an NLP data preprocessing pipeline. The pipeline should include the following stages: Tokenization, Stopwords Removal, Stemming, and Lemmatization. (10 marks)
2.2 Select a small text dataset (e.g., a paragraph or set of sentences). Apply each of the following preprocessing techniques on this dataset: (25 marks)
• Tokenization: Split the text into tokens (words).
• Stopwords Removal: Remove common stopwords from the tokenized text.
• Stemming: Apply a stemming algorithm (e.g., Porter Stemmer) to reduce words to their base forms.
• Lemmatization: Apply lemmatization to reduce words to their dictionary form.
2.3 Explain each step of the preprocessing techniques applied in 2.2. Provide insights into the significance of each technique and how it impacts the final dataset. (15 marks)
2.4 Conclusion: Write a conclusion summarizing the effects of the preprocessing steps on the text data. Discuss how these techniques help in improving the performance of NLP models. (10 marks)


Document information

Contains
Questions & answers


Content preview

COS4861
ASSIGNMENT 2 2025

UNIQUE NO.
DUE DATE: 2025

Natural Language Processing

COS4861 Assignment 2 2025

Question 1: Theory of Automata (40 marks)

1.1 Define a Deterministic Finite State Automaton (DFSA). (10 marks)

A Deterministic Finite State Automaton (DFSA) is a mathematical model used to
recognize patterns within input strings and determine their membership in a particular
language. A DFSA is defined as a 5-tuple (Q, Σ, δ, q₀, F) where:

• Q: A finite set of states
• Σ: A finite input alphabet
• δ: A transition function δ: Q × Σ → Q
• q₀: The start state (q₀ ∈ Q)
• F: A set of accept (final) states (F ⊆ Q)

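Operation: A DFSA starts in q₀ and processes the input string one symbol at a time,
moving deterministically from the current state to the single next state given by δ.
The string is accepted if the state reached after the last symbol is in F; otherwise
it is rejected.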

Example: Q = {q0, q1}, Σ = {0,1}, q0 is the start state, F = {q1}, δ defined as:

• δ(q0, 0) = q0
• δ(q0, 1) = q1
• δ(q1, 0) = q0
• δ(q1, 1) = q1

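This DFSA accepts exactly the strings that end with '1': every 1 leads to q1 and every
0 leads back to q0, so the current state always records the last symbol read.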
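
The operation described above can be sketched in a few lines of Python. This is only an illustrative sketch of the example automaton; the names DELTA, START, ACCEPT and dfsa_accepts are assumptions made for this illustration and do not come from the assignment.

```python
# Transition table for the example DFSA: (state, symbol) -> next state.
# All names in this block are hypothetical, chosen for illustration.
DELTA = {
    ("q0", "0"): "q0",
    ("q0", "1"): "q1",
    ("q1", "0"): "q0",
    ("q1", "1"): "q1",
}
START = "q0"
ACCEPT = {"q1"}


def dfsa_accepts(word: str) -> bool:
    """Run the DFSA on `word` and report whether it ends in an accept state."""
    state = START
    for symbol in word:
        # Exactly one next state exists for every (state, symbol) pair.
        state = DELTA[(state, symbol)]
    return state in ACCEPT


if __name__ == "__main__":
    for w in ["", "0", "1", "0101", "10"]:
        print(repr(w), dfsa_accepts(w))
```

Because δ is a total function, the loop never has to choose between alternatives, which is what "deterministic" means here.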




1.2 Define a Non-Deterministic Finite State Automaton (NDFSA). (10 marks)

An NDFSA is similar to a DFSA but allows several transitions (or none) from a state on
the same input symbol, as well as ε-transitions (transitions that consume no input).

An NDFSA is also a 5-tuple (Q, Σ, δ, q₀, F) but with δ: Q × (Σ ∪ {ε}) → 2^Q.

Differences from DFSA:

• Multiple transitions per symbol allowed.
• Acceptance if any computation path leads to a final state.

Example: Q = {q0, q1}, Σ = {0,1}, q0 is the start, F = {q1}, δ:

• δ(q0, 0) = {q0, q1}
• δ(q0, 1) = {q0}
• δ(q1, 0) = ∅
• δ(q1, 1) = {q1}

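This NDFSA accepts exactly the strings that contain at least one 0: it stays in q0
until it guesses that it is reading the last 0, moves to q1, and from there can read
only 1s.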
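
One common way to simulate an NDFSA is to track the set of all states it could currently be in. The sketch below does this for the example automaton, which has no ε-transitions, so no ε-closure step is included. The names NDELTA and ndfsa_accepts are assumptions made for this illustration.

```python
# Transition table for the example NDFSA: (state, symbol) -> set of next states.
# All names in this block are hypothetical, chosen for illustration.
NDELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "0"): set(),
    ("q1", "1"): {"q1"},
}
START = "q0"
ACCEPT = {"q1"}


def ndfsa_accepts(word: str) -> bool:
    """Accept if at least one computation path ends in an accept state."""
    current = {START}
    for symbol in word:
        next_states = set()
        for state in current:
            # Follow every possible transition from every possible state.
            next_states |= NDELTA.get((state, symbol), set())
        current = next_states
    return bool(current & ACCEPT)


if __name__ == "__main__":
    for w in ["", "1", "0", "1101", "00"]:
        print(repr(w), ndfsa_accepts(w))
```

Tracking a set of states rather than a single state is the same idea that the subset construction in 1.3 turns into a DFSA.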




1.3 Prove every NDFSA has an equivalent DFSA. (15 marks)

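Use the subset construction: each state of the DFSA is a set of NDFSA states, namely
the states the NDFSA could be in after reading the input so far. A DFSA state is
accepting if it contains at least one accepting NDFSA state.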

Let NDFSA N:

• Q = {A, B}, Σ = {0,1}, q0 = A, F = {B}, δ:
  ◦ δ(A,0) = {A,B}, δ(A,1) = {A}
  ◦ δ(B,0) = ∅, δ(B,1) = {B}

Construct DFSA:

• Start state: {A}
• On 0: δ({A},0) = {A,B}
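
The construction started above can be carried out mechanically. The following is a minimal sketch of the subset construction for an NDFSA without ε-transitions, applied to the automaton N defined in this answer. The function name subset_construction and its parameter names are assumptions made for this illustration, not part of the assignment.

```python
from collections import deque


def subset_construction(nfa_delta, alphabet, start, accepting):
    """Build an equivalent DFSA whose states are frozensets of NDFSA states.

    Assumes the NDFSA has no epsilon-transitions (hypothetical helper).
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the NDFSA moves from every state in the current set.
            target = frozenset(
                t for s in current for t in nfa_delta.get((s, symbol), set())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains any accepting NDFSA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return seen, dfa_delta, start_set, dfa_accepting


# The NDFSA N from the text.
N_DELTA = {
    ("A", "0"): {"A", "B"},
    ("A", "1"): {"A"},
    ("B", "0"): set(),
    ("B", "1"): {"B"},
}

states, delta, start_state, accept_states = subset_construction(
    N_DELTA, ["0", "1"], "A", {"B"}
)

if __name__ == "__main__":
    for (src, sym), dst in sorted(delta.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
        print(sorted(src), sym, "->", sorted(dst))
```

Only subsets reachable from the start state {A} are generated, so the resulting DFSA can be much smaller than the full power set of Q.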
