
Genomics and proteomics are closely related fields. The main difference between them is that genomics is the study of the entire set of genes in the genome of a cell, whereas proteomics is the study of the entire set of proteins produced by the cell.

Genomics and Proteomics: A Signal Processor’s Tour

P. P. Vaidyanathan

Abstract
The theory and methods of signal processing are becoming increasingly important in molecular biology. Digital filtering techniques, transform domain methods, and Markov models have played important roles in gene identification, biological sequence analysis, and alignment. This paper contains a brief review of molecular biology, followed by a review of the applications of signal processing theory. This includes the problem of gene finding using digital filtering, and the use of transform domain methods in the study of protein binding spots. The relatively new topic of noncoding genes, and the associated problem of identifying ncRNA buried in DNA sequences, are also described. This includes a discussion of hidden Markov models and context-free grammars. Several new directions in genomic signal processing are briefly outlined at the end.




Keywords: Genomic signal processing, bioinformatics, genes, protein-coding, DNA, and ncRNA.

6 IEEE CIRCUITS AND SYSTEMS MAGAZINE 1531-636X/04/$20.00©2004 IEEE FOURTH QUARTER 2004

1. Introduction

Since the sensational announcement of the double helix structure of the DNA molecule by Watson and Crick more than fifty years ago [1], there has been phenomenal progress in genomics. With the enormous amount of genomic and proteomic data available to us in the public domain, it is becoming increasingly important to be able to process this information in ways that are useful to humankind. Traditional as well as modern signal processing methods have played an important role in these fields. Genomic signal processing is primarily the processing of DNA sequences, RNA sequences, and proteins. A DNA sequence is made from an alphabet of four elements, namely A, T, C, and G. For example,

. . . ATCCCAAGTATAAGAAGTA . . .
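
As a small illustration of the four-letter alphabet, the following Python sketch checks that a string uses only A, T, C, and G and tallies how often each base occurs. The names `is_dna` and `base_counts` are hypothetical, chosen only for this example, and the input is the fragment quoted above with the spacing removed.

```python
from collections import Counter

DNA_ALPHABET = frozenset("ATCG")  # the four bases named in the text


def is_dna(seq):
    """Return True if every symbol of seq belongs to the DNA alphabet."""
    return all(symbol in DNA_ALPHABET for symbol in seq)


def base_counts(seq):
    """Count occurrences of each base; raise ValueError on other symbols."""
    if not is_dna(seq):
        raise ValueError("sequence contains symbols outside A, T, C, G")
    counts = Counter(seq)
    return {base: counts[base] for base in "ATCG"}


fragment = "ATCCCAAGTATAAGAAGTA"  # example fragment from the text
print(is_dna(fragment))
print(base_counts(fragment))
```

Keeping the alphabet in one constant means the same check can be reused for any other discrete alphabet by swapping the set of allowed symbols.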


The letters A, T, C, G represent molecules called nucleotides or bases (to be described soon). Since DNA contains the genetic information of living organisms, we see that life is governed by quaternary codes. Another example of discrete-alphabet sequences in life forms is the protein. A large number of functions in living organisms are governed by proteins. A protein is a sequence of amino acids. There are twenty distinct amino acids, and so a protein can be regarded as a sequence defined on an alphabet of size twenty. The twenty letters used to denote the amino acids are the letters of the English alphabet except B, J, O, U, X, and Z. For example, a part of the protein sequence could be

. . . PPVACATDEEDAFGGAYPQ . . .
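
The rule for the amino-acid alphabet can be written down directly. The sketch below builds the twenty-letter set by removing B, J, O, U, X, and Z from the English alphabet and then checks the example fragment against it. The names `AMINO_ACIDS` and `is_protein` are hypothetical and are used only for illustration.

```python
import string

EXCLUDED = set("BJOUXZ")  # letters not used for amino acids, per the text
AMINO_ACIDS = frozenset(set(string.ascii_uppercase) - EXCLUDED)


def is_protein(seq):
    """Return True if every symbol of seq is one of the twenty letters."""
    return all(symbol in AMINO_ACIDS for symbol in seq)


fragment = "PPVACATDEEDAFGGAYPQ"  # example fragment from the text
print(len(AMINO_ACIDS))
print(is_protein(fragment))
```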

Notice that some letters representing amino acids are identical to some letters representing bases. For example, the A in the DNA is a base called adenine, and the A in the protein is an amino acid called alanine.

Figure 1. (a) The DNA double helix, (b) linearized schematic, and (c) details of the sugar-phosphate backbone. In part (b), the bottom strand is complementary to the top strand in the sense that A and T are paired and so are C and G. This is because of a weak bonding called hydrogen bonding between these pairs of molecules. Labels in the figure: sugar-phosphate backbone, a 3.4 nm (34 Å) dimension marked on the helix in part (a), 5′ and 3′ strand ends in parts (b) and (c), and nucleotide = base + sugar + phosphate in part (c).

If we assign numerical values to the four letters in the DNA sequence, we can perform a number of signal processing operations such as Fourier transformation [26, 3], digital filtering [27], time-frequency plots such as wavelet transformations [17], and Markov modelling [4]. Some of those are quite interesting and in fact have important practical applications. Similarly, once we assign numerical values to the twenty amino acids in protein sequences, we can do useful signal processing.
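
One possible numerical assignment, assumed here because this section does not fix a particular mapping, is to replace the DNA string with four binary indicator sequences (one per base) and take the discrete Fourier transform of each. The Python sketch below does this with a naive DFT built on the standard library. The names `indicator_sequences`, `dft`, and `power_spectrum` are hypothetical, and the approach is an illustration rather than the method of the cited references.

```python
import cmath


def indicator_sequences(dna):
    """Map a DNA string to four 0/1 sequences, one per base (assumed mapping)."""
    return {base: [1 if s == base else 0 for s in dna] for base in "ACGT"}


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a list of numbers."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def power_spectrum(dna):
    """Sum of squared DFT magnitudes over the four indicator sequences."""
    spectra = [dft(seq) for seq in indicator_sequences(dna).values()]
    return [sum(abs(X[k]) ** 2 for X in spectra) for k in range(len(dna))]


fragment = "ATCCCAAGTATAAGAAGTA"  # example fragment from the text
spectrum = power_spectrum(fragment)
print([round(value, 3) for value in spectrum])
```

The indicator mapping avoids imposing an arbitrary ordering on the four bases, which a single assignment such as A=1, C=2, G=3, T=4 would do. For long sequences a fast Fourier transform would replace the naive `dft`.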

P. P. Vaidyanathan 1 is with the Department of Electrical Engineering, 136-93, California Institute of Technology, Pasadena, CA 91125. Email:


1 Work supported in part by the ONR grant N00014-99-1-1002.


