Exam (elaborations)

COS4861 Assignment 2 (NLP) Due 2025

Pages: 10
Grade: A+
Uploaded on: 13-07-2025
Written in: 2024/2025

COS4861/0/2025 Assignment 2: Automata and NLP Preprocessing

This document is a response to Assignment 2 for COS4861/0/2025. It covers the theory of finite automata and the practical preprocessing of text data for Natural Language Processing (NLP).

Theoretical Foundations: Automata in NLP

The theoretical section covers Deterministic Finite State Automata (DFSA) and Non-Deterministic Finite State Automata (NDFSA). It defines their components (Q, Σ, δ, q0, F), operations, and key distinctions using formal notation and examples. It proves the equivalence of NDFSA and DFSA using the subset construction algorithm, with a step-by-step conversion example, and analyzes the role of these automata in NLP tasks such as tokenization and syntax parsing.

Practical Implementation: NLP Preprocessing Pipeline

The practical component implements an NLP preprocessing pipeline with Python's NLTK library, applying the following steps to a sample text dataset:

– Tokenization: Breaking text into individual words or units.

– Stopword Removal: Eliminating common, low-information words.

– Stemming: Reducing words to their root form heuristically.

– Lemmatization: Reducing words to their base dictionary form (lemma) using linguistic knowledge.


Type: Exam (elaborations)
Contains: Questions & answers


Content preview

COS4861

Assignment 2

Natural Language Processing

Due 2025

COS4861/2025 Assignment 2: Natural Language Processing (NLP)



Question 1: Theory of Automata (40 Marks)
1.1 Deterministic Finite State Automaton (DFSA)
A Deterministic Finite State Automaton (DFSA) is defined as a 5-tuple:

M = (Q, Σ, δ, q0, F)

where:

– Q: A finite set of states

– Σ: A finite input alphabet

– δ: A transition function δ : Q × Σ → Q

– q0 ∈ Q: The start state

– F ⊆ Q: A set of accepting states

For every state and input symbol, δ specifies exactly one next state, so the automaton's run on any input string is uniquely determined.
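The 5-tuple maps directly onto a small data structure. The following is a minimal Python sketch, not part of the assignment: the class name `DFSA`, its field names, and the `even_ones` automaton (which accepts strings with an even number of 1s) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DFSA:
    states: frozenset     # Q
    alphabet: frozenset   # Σ
    delta: dict           # δ stored as {(state, symbol): next_state}
    start: str            # q0
    accepting: frozenset  # F

    def is_well_formed(self) -> bool:
        """Check q0 ∈ Q, F ⊆ Q, and that δ is a total function Q × Σ → Q."""
        if self.start not in self.states or not self.accepting <= self.states:
            return False
        for q in self.states:
            for a in self.alphabet:
                # A missing entry yields None, which is never a state.
                if self.delta.get((q, a)) not in self.states:
                    return False
        return True

    def accepts(self, word) -> bool:
        """Follow exactly one transition per symbol, then test membership in F."""
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


# Hypothetical example: accept strings containing an even number of 1s.
even_ones = DFSA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset({"0", "1"}),
    delta={
        ("even", "0"): "even", ("even", "1"): "odd",
        ("odd", "0"): "odd", ("odd", "1"): "even",
    },
    start="even",
    accepting=frozenset({"even"}),
)

print(even_ones.is_well_formed())  # True
print(even_ones.accepts("1001"))   # True
print(even_ones.accepts("1000"))   # False
```

Storing δ as a dictionary keyed by (state, symbol) makes the determinism requirement easy to verify: each key can hold only one next state, and `is_well_formed` confirms that no key is missing.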

Example: A DFSA that accepts binary strings ending in 01:

Q = {q0, q1, q2}, Σ = {0, 1}, q0 = start state, F = {q2}

δ(q0, 0) = q1, δ(q0, 1) = q0

δ(q1, 0) = q1, δ(q1, 1) = q2

δ(q2, 0) = q1, δ(q2, 1) = q0

[State diagram: start → q0 --0--> q1 --1--> q2, with q2 accepting; only the path that reads 01 is shown.]
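The example automaton can be run by hand or in a few lines of Python. The sketch below assumes the complete transition table for "ends in 01", with one move for every state on both 0 and 1. The names `DELTA`, `trace`, and `ends_in_01` are illustrative and do not come from the assignment.

```python
# Transition table δ for "binary strings ending in 01".
# Assumption: the table is total, with one entry per (state, symbol) pair.
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q1", ("q2", "1"): "q0",
}
START = "q0"
ACCEPTING = {"q2"}


def trace(word):
    """Return the list of states visited while reading `word`, starting at q0."""
    states = [START]
    for symbol in word:
        states.append(DELTA[(states[-1], symbol)])
    return states


def ends_in_01(word):
    """Accept iff the run finishes in an accepting state."""
    return trace(word)[-1] in ACCEPTING


print(trace("1101"))       # ['q0', 'q0', 'q0', 'q1', 'q2']
print(ends_in_01("1101"))  # True
print(ends_in_01("0110"))  # False
```

Intuitively, q1 records "the last symbol read was 0" and q2 records "the last two symbols read were 01", which is why a 0 always leads to q1 and a 1 from q2 falls back to q0.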



