Essay

Bioinformatics essay assignment

Grade

A+

Uploaded on

19-01-2025

Written in

2021/2022

Bioinformatics assignment using data interpretation and graphical analysis on the topic. Submitted for biomedical science, first class grade for final year assignment.

The Use of Bioinformatic Tools to Assess the Structure and Function of an
Unidentified Coding and Non-Coding Sequence
19002410

Abstract

Nucleotide sequences 4a and 4b were first identified using NCBI nucleotide BLAST and then
characterised with a range of bioinformatic tools. The aim was to establish the characteristics of
each sequence and its function in regulating cellular processes or its role in pathology. Sequence 4a
was identified as the long non-coding RNA HOTAIR, which is upregulated in several diseases.
Sequence 4b was identified as the coding sequence BECN1, which is responsible for autophagy and
closely associated with other protein subunits of the PI3K complex. Further databases, namely
STRING, HuRI, AlphaFold, Genevisible and DisGeNET, were then used to characterise each sequence
and establish its function. Outputs were interpreted according to the purpose of each tool, in terms
of structural features, role in cell signalling, protein-protein interactions, splice variants, or
levels of expression across a range of tissues. The prevalence of each gene within a tissue was
associated with normal or abnormal cellular function, which is relevant to medical research. The
information provided by each database informed subsequent investigation into the role of each
sequence within the Homo sapiens genome.

Introduction

Bioinformatics has allowed large volumes of biological and statistical information to be
processed, stored, and collated in databases. Querying these complex datasets can provide a wide
range of information with which to determine the identity and characteristics of sequences. Of
interest here was the role of non-coding and coding nucleotide sequences, which were analysed in
terms of their homology to a range of genes; the results showed differences arising from splice
variants and from interactions with structural proteins. Approximately 98% of genetic information
can be categorised as non-coding, so a great deal can be explored and deduced from sequences
that do not code for proteins (Perenthaler et al., 2019). A variety of bioinformatic tools can be
used together to gain further insight into genomics, with sequences analysed to reveal their role in
disease presentation, cellular function or signalling pathways. Notably, the Human Genome Project
helped the field of bioinformatics to gain traction within the scientific community as an efficient
way to store information and to establish a valuable relationship with the related disciplines of
mathematics and computer science (Hood and Rowen, 2013). The availability of a range of
bioinformatic tools allowed a more objective view of the role of a gene within the body and a
clearer understanding of its functional significance, because deductions were drawn from a greater
number of resources with broad coverage across several databases. The aims of this project were to
understand the function of two sequences within the human genome and to describe the key
features that distinguish them from other elements of the transcriptome. Bioinformatic tools were
used as the basis of research into each sequence: each search result led to further investigation
using other databases, to gain insight into the significance of non-coding sequence 4a and coding
sequence 4b.

Methods

The bioinformatic tools used to obtain structural and functional information on sequences 4a and
4b during this investigation are summarised in Table 1.

Table 1. List of bioinformatic tools that were used, including their purpose in the characterisation of sequences 4a and 4b. A
range of bioinformatic tools was used to determine the structure and function of each sequence, starting with NCBI Nucleotide
BLAST. Each sequence was identified as either coding or non-coding and named according to its homology to
predicted sequences stored within the chosen database. Outputs provided by each tool informed subsequent investigation using
other databases to explore the impact of the sequences in more detail.

Name of Bioinformatic Tool | Purpose of Tool
NCBI Nucleotide BLAST | To confirm the identity of each sequence by comparison with several predicted and experimentally confirmed outputs
STRING | To assess the interaction of sequence 4b with other co-regulatory proteins within protein complexes
HuRI | To check the coverage of STRING and the objectivity of the outputs provided for sequence 4b by comparing the two databases
AlphaFold | To visualise the structural features of sequence 4b in relation to its function
Genevisible | To compare the level of expression of sequence 4a across an array of tissue types in healthy samples and in cancer presentation
DisGeNET | To compare the role of splice variants within exonic and intronic regions of sequence 4a in disease presentation, and to determine the chromosome on which it is found
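The assignment of tools to sequences in Table 1 can also be written down as a small lookup, which makes it easy to see which tools were applied to which sequence. The sketch below is illustrative only: the names `TOOLS` and `tools_for` are hypothetical, and the purpose strings are shortened paraphrases of the table rather than part of the original method.

```python
# Illustrative lookup built from Table 1: (tool, sequences it was applied to, purpose).
# The structure and names are assumptions made for this sketch.
TOOLS = [
    ("NCBI Nucleotide BLAST", ("4a", "4b"), "confirm the identity of each sequence"),
    ("STRING", ("4b",), "assess interactions with co-regulatory proteins"),
    ("HuRI", ("4b",), "cross-check the coverage of STRING"),
    ("AlphaFold", ("4b",), "visualise structural features"),
    ("Genevisible", ("4a",), "compare expression across tissue types"),
    ("DisGeNET", ("4a",), "relate splice variants to disease; locate chromosome"),
]


def tools_for(sequence):
    """Return the tool names applied to one sequence, in the order of Table 1."""
    return [name for name, sequences, _purpose in TOOLS if sequence in sequences]


if __name__ == "__main__":
    for seq in ("4a", "4b"):
        print(seq, "->", ", ".join(tools_for(seq)))
```

Reading the lookup this way reflects the order of the investigation: BLAST was applied to both sequences first, and the remaining tools were chosen according to whether the sequence was coding or non-coding.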

Results

Non-coding sequence

A total of 11 sequences were returned by NCBI nucleotide BLAST, confirming the identity of
sequence 4a as the non-coding HOX antisense intergenic RNA (HOTAIR). HOTAIR displayed
100% sequence identity across the full query length of 2158 nucleotides. Searches were restricted to
the Homo sapiens genome, and only highly similar sequences were selected (Figure
1).

Figure 1. Output from NCBI Nucleotide BLAST using nucleotide sequence 4a (NCBI, 2021). Sequences similar to
query sequence 4a are listed, with the search restricted to the Homo sapiens genome.
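The two figures reported for the top hit, percent identity and coverage of the 2158-nucleotide query, can be checked programmatically when BLAST results are exported in the standard 12-column tabular format. The sketch below is a minimal illustration, not the procedure used in this investigation, which relied on the BLAST web interface. The hit rows, the identifiers `seq4a`, `hit_A`, `hit_B` and `hit_C`, and the function names are all hypothetical, and coverage is computed from a single alignment per hit as a simplifying assumption.

```python
# Column order of BLAST's default tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

QUERY_LENGTH = 2158  # length of sequence 4a as reported in the text


def parse_hits(tabular_text):
    """Turn tab-separated BLAST rows into dictionaries with numeric fields."""
    hits = []
    for line in tabular_text.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, line.split("\t")))
        row["pident"] = float(row["pident"])
        row["qstart"] = int(row["qstart"])
        row["qend"] = int(row["qend"])
        hits.append(row)
    return hits


def query_coverage(hit, query_length=QUERY_LENGTH):
    """Percentage of the query spanned by this single alignment."""
    span = abs(hit["qend"] - hit["qstart"]) + 1
    return 100.0 * span / query_length


def full_length_matches(hits, min_identity=100.0, min_coverage=100.0):
    """Subject IDs whose alignment meets both thresholds."""
    return [h["sseqid"] for h in hits
            if h["pident"] >= min_identity
            and query_coverage(h) >= min_coverage]


# Hypothetical rows for illustration only; these are not real BLAST results.
EXAMPLE = "\n".join([
    "seq4a\thit_A\t100.000\t2158\t0\t0\t1\t2158\t1\t2158\t0.0\t3986",
    "seq4a\thit_B\t99.500\t2158\t11\t0\t1\t2158\t1\t2158\t0.0\t3900",
    "seq4a\thit_C\t100.000\t1079\t0\t0\t1\t1079\t1\t1079\t0.0\t1993",
])

if __name__ == "__main__":
    print(full_length_matches(parse_hits(EXAMPLE)))
```

In this toy input only `hit_A` satisfies both thresholds: `hit_B` spans the whole query but is not identical, and `hit_C` is identical but covers only half of the query. Requiring both conditions is what distinguishes a confident identification from a partial or near match.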

Following the identification of sequence 4a, the extent of HOTAIR expression
within healthy tissue was examined using data from Genevisible (Figure 2).

Written for

Institution: Keele University (KU)
Study: Unknown
Module: LSC-30057 bioinformatics and science communication (LSC30057)

Document information

Uploaded on: January 19, 2025
Number of pages: 9
Written in: 2021/2022
Type: ESSAY
Professor(s): Unknown
Grade: A+
