Overview of bioinformatics
Bioinformatics deals with the applications of different sciences to biological systems, so we look into the different fields of science and how they contribute to the birth and development of bioinformatics. For example, computer science: can you tell one example of how computer science contributes to bioinformatics? Machine learning can be used, probably. Correct, and in computer science a lot of computer algorithms have been used for solving biological problems. So you can develop or download several programs and extract the hidden data available in biological information. You can also use different machine learning techniques and algorithms to understand and capture the information, as well as for prediction purposes. So can you tell one example of how mathematics or statistics is used in the development of bioinformatics?
An example would be the use of statistics,
for example in the Ramachandran plot, where they
plot the phi and psi angles of the protein backbone. Correct. So you can use mathematics
to derive some principles and relate the data, for example the protein
sequences, or how the residues are distributed in the plot, and so on. Using
mathematics you can verify the data and, if you obtain any model, check
whether it is statistically significant or not. You can relate the data with
correlation analysis and regression techniques, and you
can see whether the results are statistically significant or not. So now, if you use
information technology, how does information technology contribute to the
development of bioinformatics? There has been an increase in computational resources over the decades, which
is used to do large-scale data analysis. Correct. So you can use it for
analysis, you can develop online resources, you have more computer
storage, and so on, and that will enhance the applications of
bioinformatics to various fields.

Likewise physics: if you talk about physics,
you can see the concept of various types of interactions, like electrostatic
interactions and other interactions, and how these interactions are important
for understanding the folding mechanism of proteins. For example, if you take protein
folding, in the unfolded state a protein is wobbling, and you
can see it is very much like a random coil conformation.
When this protein folds, it
forms a specific three-dimensional structure. How can a protein attain a specific
3D structure from its sequence? This can be explained by various
types of interactions, like disulphide bonds, electrostatic interactions, and van der
Waals interactions. So understanding the principles governing the folded state of
a protein requires physical concepts, and you can use physics to
understand the mechanism of folding of proteins.

So if we consider
all the fields, life science, computing, maths, statistics, information
technology, physics, or chemistry, which one is the major field for the
birth of bioinformatics? Life sciences. Life sciences, right, because we need the
data. Without data, even if you have several fields, we cannot apply them
to anything. So where shall we get the
data? We can get the data from biological experiments. If we look into the
life sciences, they have produced a lot of data on various aspects, such
as macromolecular sequences, for example DNA sequences or protein sequences,
structures such as protein structures, DNA structures, and complex structures,
different expression profiles, different pathways, and so on. So how is
bioinformatics used to understand the data? Bioinformatics helps to
acquire the data, to manage the data, to analyze the data, and to understand
the data. So bioinformatics is the major field for understanding the
concepts and the hidden information in the data produced by experiments
in the life sciences. So now, what are the various aspects: how
bioinformatics has grown, what its various applications are, and how
bioinformatics contributes to society? There are various aspects, so
briefly I put them into five bullet points.

The first one is well-organized databases. Biologists do experiments, produce
data, and publish the data in the literature. If the data
are scattered here and there, it is very important to collect
all the information, put it in a proper form as a database, and then give some
options to extract the data from this database. This is what bioinformatics does:
it develops plenty of databases that are well organized and in computable form.
Once we have the data, we have a sufficient number of data points, which is essential
and required for any analysis; otherwise you do not get any
statistically significant results.

The second one: once you have the database, you
can derive hypotheses. What is the relationship between some
specific features and a function? If you know the function, or if you
know any specific characteristics of a biological system, what gives these
characteristics to that particular system? So here we try to develop some
features, and using these features you can link to the function of any biological
system. For example, if you take protein sequences or protein structures, there
are different types of proteins: if you take protein structures, there are
globular proteins, membrane proteins, and different types of globular proteins, say
alpha proteins, beta proteins, and so on. Can we identify these types
of proteins from a pool of sequences? On the left side you can see the
sequences. If you have the sequence, you know the information regarding the
amino acid residues, so you can get the number of residues of each type;
there are twenty different types of residues present in these proteins.
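
Counting the residues of each type in a sequence can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the toy sequence, and the particular set of residues treated as hydrophobic are assumptions made for the example, not something given in the lecture.

```python
from collections import Counter

# The twenty standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one possible choice of "hydrophobic" residues; other scales differ.
HYDROPHOBIC = set("AVLIMFWC")


def composition(sequence):
    """Return the percentage of each of the twenty residue types."""
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    # Counter returns 0 for residue types that never occur.
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def hydrophobic_percent(sequence):
    """Return the summed percentage of the residues listed in HYDROPHOBIC."""
    comp = composition(sequence)
    return sum(comp[aa] for aa in HYDROPHOBIC)


if __name__ == "__main__":
    toy = "MKTLLLAVAVLFFWAG"  # made-up sequence, for illustration only
    print(round(hydrophobic_percent(toy), 1))
```

A composition vector like this is one simple feature that can then be compared across protein classes.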
And you can relate these features: if there is a dominance of some specific residues, for
example hydrophobic residues, then you can say that this could be a membrane protein.
Likewise for proteins having different functions or involved in different types of
diseases. So you can derive the hypothesis: what is the basic principle behind
this biological system?

Once you derive the hypothesis and understand that
these are the major factors for a specific system, the next step is
whether we are able to derive
an algorithm. If you see the features and the functions and
the relationship between them, on this side you have the features and on this side you
have the function. Can we relate these features and the function? Then
what is the mathematical equation to characterize the function in terms of
the features? We can fit a function to predict the function from
the features, and when we do this we can make an algorithm. Once the algorithm
is ready, we can make it available to the public; in this case we use web servers or online
applications. In the earlier days, when the internet was not fast enough,
everyone created their own
algorithm and kept it to themselves, because it was difficult to transfer. But currently, due to
the advancements in computers,
biology, and computational techniques, and fast internet facilities, several web servers have
been developed to make these applications available to others. In our laboratory
also we have developed various tools, which are widely used in the literature. So
many people use these databases as well as the tools to try to understand
biological systems.

The fourth one: I will briefly discuss
virtual screening. For example, in the case of drug design, it is currently
very popular, because a lot of small molecules are available in the
literature and people are affected by several types of diseases, like
cardiovascular diseases, cancer, and so on, and more recently
chikungunya, dengue, and so on. In all these cases, to identify drugs, they
try to find a target, and then we see what the functions of the particular
target are, what its actions are, and what the important residues are, and then we
try to inhibit
that activity so that we can reduce the disease. Here is one example
of structure-based drug design. If you have a protein, this is the
target; here I show c-Yes kinase as the target, because c-Yes kinases
are very important for several cellular activities. There are several
kinases, and one is c-Yes kinase, which is very important
in colorectal cancer. So this is an attractive target for finding an inhibitor for colorectal
cancer. To do this, there are various options for how to derive a particular compound to be
an inhibitor, but it is a very large pool, so how to derive it? In this case I
will give one example: finding a fish
in a pond. Here you can see different fish in
different ponds. If you want to catch a fish, where will you put your net? In
number one, right? If you put it in number one or number two, you will get a fish. If
you put your net in number four, you will not get anything; you will only spend
much time and get nothing. So if someone tells you, OK,
you have been trying to catch a fish for a long time, try
putting the net on this particular side, and if you do it and get a fish,
then you will be very happy, because you do not have to waste your time.
So this is the case: for any disease there are
different compounds, for example
compound one, compound two, compound three, and so on. If we take compound
one and try it and it fails, it is not a drug; compound two
is also not a good drug; and then compound three is probably a drug.
There are millions of compounds if you look into the literature. For example, if you
go to the ZINC database there are thirty-five million compounds, in the
Enamine database two point two million compounds, and among the natural compounds
of Chinese medicine thirty-five thousand compounds. If you want to try them
one by one, by the time you have tried everything the patient will have died.
Second, if you try to test each compound experimentally, it will take a long
time, it needs a lot of manpower, and it also needs a lot of money. So in
this case, how to do it? Among these thirty-five million compounds, if someone
can
reduce them to thirty-five thousand compounds, then the number of experiments
will be reduced by one thousand times. Likewise, if instead of two point two
million compounds we have a hundred or a thousand compounds, then we can
reduce the search space enormously. How to do it? Here is a solution:
bioinformatics can do it, because currently we have very fast computers and
very good techniques, so it can assist in searching for drug targets and
designing drugs from many millions of compounds.

The second one is how
to use a hypothesis. Here I show you an example on food pattern and weight
control. I have five known cases, experimentally known. Take example number one:
rice fifty percent, wheat twenty-five percent, meat ten percent, fruits ten
percent, and vegetables five percent. With this pattern,
weight is not controlled.
Go to the second one: rice thirty percent, wheat five percent, meat ten percent,
fruits thirty percent, and vegetables twenty-five percent. In this case also it is
not controlled. In the third one, rice thirty-five percent, wheat
ten percent, meat ten percent, fruits fifteen percent, and vegetables thirty
percent, and here it is controlled. Likewise for the five examples. Now I have a test
case: another person consumes twenty percent rice,
ten percent wheat, twenty percent fruits, ten percent meat, and forty percent
vegetables. Now the question is, is it controlled or not? What is the
answer? It is controlled. Right, that is correct. So why is it controlled? How do you
know it is controlled? Can you give one reason? Look at the fifth item,
vegetables: here vegetables are twenty-five percent but it is not
controlled, and here vegetables are even less. So we can derive some principles; you can
derive some equations
using statistics. To show one example, we can say that it is
controlled if wheat is
less than thirty-five percent, meat is less than fifteen percent, and
vegetables are more than thirty percent. Likewise you can derive several
equations. You can properly study the initial experimental data
sets, and from these experimental data sets you can derive some equations,
some conditions where this will fit, then apply these conditions to any set of
data and see whether that case is controlled or not. Likewise,
bioinformatics can handle large amounts of data and provide possible solutions.
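
The idea of deriving conditions from known cases and applying them to a test case can be sketched as below. This is a hedged illustration rather than the lecturer's actual procedure. Only the three examples spelled out in the lecture are included. The fruit share of the third example is taken as fifteen percent so that the shares add up to one hundred, and the vegetable condition is written as "at least thirty percent" so that the third example satisfies it; both are assumptions. The function name `is_controlled` is hypothetical.

```python
# Known cases from the lecture: (diet in percent, weight controlled?).
# Assumption: fruits = 15 in the third case, so that each diet sums to 100.
KNOWN_CASES = [
    ({"rice": 50, "wheat": 25, "meat": 10, "fruits": 10, "vegetables": 5}, False),
    ({"rice": 30, "wheat": 5, "meat": 10, "fruits": 30, "vegetables": 25}, False),
    ({"rice": 35, "wheat": 10, "meat": 10, "fruits": 15, "vegetables": 30}, True),
]

# The test case posed in the lecture.
TEST_CASE = {"rice": 20, "wheat": 10, "meat": 10, "fruits": 20, "vegetables": 40}


def is_controlled(diet):
    """Apply the derived conditions to one diet.

    Assumption: vegetables >= 30 (not strictly greater), so that the
    third known case, with exactly thirty percent, is counted as controlled.
    """
    return (
        diet["wheat"] < 35
        and diet["meat"] < 15
        and diet["vegetables"] >= 30
    )


if __name__ == "__main__":
    # Check that the conditions reproduce the known cases, then predict.
    for diet, label in KNOWN_CASES:
        print(is_controlled(diet) == label)
    print("test case controlled:", is_controlled(TEST_CASE))
```

With real data, the conditions would be derived statistically from many more cases and checked on data not used to derive them.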
OK, so I will explain a little more about the virtual screening of
compounds. Here I show one example of how we use virtual screening in
drug design. Here is the protein;
this
is c-Yes kinase. There are different domains:
on the left side you can see the
SH3 domain and the SH2
domain, here is the catalytic domain, here is a
phosphorylation site, tyrosine 426, and this is the C-lobe. As I said,
this kinase is very important in colorectal cancer, so this is a target, and
it is important to identify probable hit compounds for this particular c-Yes
kinase enzyme. So this is one aspect: on one side we have the protein, the
target c-Yes kinase. On the other side, how to design an inhibitor?
We have the Enamine
library with two point two million compounds. Among the two point two million
compounds, how to choose the probable compounds that can be lead compounds for
a drug? If I test all two point two million compounds, it will take a long
time and a lot of money, because each
compound costs thirty to forty thousand rupees. So we cannot
spend the time to do it for all the two point two million compounds. So how to do that?
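
The scale of the problem can be made concrete with a little arithmetic. The sketch below assumes that the thirty to forty thousand rupees quoted in the lecture is the cost per compound; the function names are hypothetical.

```python
def screening_cost_rupees(n_compounds, cost_per_compound):
    """Total cost of testing every compound experimentally."""
    return n_compounds * cost_per_compound


def reduction_factor(n_before, n_after):
    """How many times fewer experiments are needed after pre-filtering."""
    return n_before / n_after


if __name__ == "__main__":
    library_size = 2_200_000  # Enamine library size quoted in the lecture

    # Assumption: 30,000 to 40,000 rupees is a per-compound cost.
    low = screening_cost_rupees(library_size, 30_000)
    high = screening_cost_rupees(library_size, 40_000)
    print(f"Testing everything: {low:,} to {high:,} rupees")

    # Reducing thirty-five million candidates to thirty-five thousand.
    print("Reduction:", reduction_factor(35_000_000, 35_000))
```

Under that assumption the full library would cost tens of billions of rupees to test, which is why the candidates are narrowed down computationally first.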
In this case you can derive a methodology. First you see whether the
structure is known. If the structure is known, you can use that
structure; if the structure is not known, you may need to model the
structure. Then we have to search for the active sites: where are the
active sites, where will the ligand bind?
We look at the pockets, which are the binding sites; here is the site. Then,
for the two point two million compounds, you check the features of all the
compounds and set some conditions to fit this particular pocket.
You can use molecular weight, you can use the hydrogen bond donors or
acceptors, and various other options, and then you eliminate the compounds that do not fit.
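
The elimination step can be sketched as a simple property filter. This is only an illustration under stated assumptions: the compound records and names are made up, and the default thresholds are borrowed from Lipinski's rule of five (molecular weight at most 500, at most 5 hydrogen bond donors, at most 10 acceptors), not from the conditions actually used for this target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float   # molecular weight in daltons
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors


def passes_filter(c, max_weight=500.0, max_donors=5, max_acceptors=10):
    """Return True if the compound satisfies all property limits.

    Assumption: the defaults follow Lipinski's rule of five; a real study
    would choose limits that suit the binding pocket.
    """
    return (
        c.mol_weight <= max_weight
        and c.h_donors <= max_donors
        and c.h_acceptors <= max_acceptors
    )


def prefilter(library, **limits):
    """Keep only the compounds that pass the property filter."""
    return [c for c in library if passes_filter(c, **limits)]


# Made-up compounds, for illustration only.
LIBRARY = [
    Compound("cmpd-1", 320.4, 2, 5),
    Compound("cmpd-2", 712.9, 4, 9),   # too heavy
    Compound("cmpd-3", 410.5, 7, 6),   # too many donors
    Compound("cmpd-4", 150.2, 1, 2),
]

if __name__ == "__main__":
    for c in prefilter(LIBRARY):
        print(c.name)
```

Cheap filters like this are applied first so that the more expensive steps only have to handle the compounds that survive.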
Finally you can use a virtual screening technique like
docking with the remaining compounds, and finally you arrive at some
compounds. In two thousand fourteen the Tokyo Institute of Technology organized
a competition to identify inhibitors for this particular target, and we also
contributed to that. We identified about one hundred and twenty compounds,
they tested fifty compounds out of the one hundred and twenty, and four compounds showed
inhibition, one of which is a probable hit compound. They continued the same in the
next year, two thousand fifteen, with about two thousand compounds;
they found that five showed inhibition and two are hits. In the lower part
you can see this figure, which shows how the ligand interacts with the protein.
You can see the green one: the green shows the ligand, and the surrounding ones
are the protein residues. They have some specific interactions; you can
see the hydrogen bonds, the hydrophobic interactions, and other interactions, and
because of these interactions the compounds bind tightly to the protein and
act as inhibitors of this particular kinase. In the later classes I
will explain in more detail about structure-based drug design. So till
now we have discussed a few aspects of bioinformatics. What are the different
aspects we discussed? The first one is well-organized databases:
bioinformatics contributes to organized databases. Then the second one is
computationally derived hypotheses: when you have the data, you can develop several functions to relate
the features and the functions. Once we have derived the hypotheses, we
can develop several algorithms for prediction, and once we
can predict, we can make them available as online applications in the form of web servers.
There are several web servers, for example for protein structure prediction, protein
function prediction, how DNA can bend, how DNA can interact with proteins, and
so on. Then the fourth one, which we discussed just now, is the virtual screening of
compounds: how they do the screening for drug development. And currently, if
you see, bioinformatics is widely applied in next-generation sequence
analysis. Now everyone is interested in personalized medicine. A few years ago it was
very expensive to sequence a genome; now it is very cheap to get a sequence.
So everyone wants to see their genome, what proteins they
have, what functions those proteins perform, and what the
probability is of having any specific mutation in a protein, and so on. In this
case we currently have a lot of data obtained from next-generation
sequencing, for example Illumina sequencing.
Due to the advancements in sequencing techniques, a lot of
data are now available in the literature, but the question is how to analyse the data and
how to extract information from these sequences. There are
several ways to get the sequences; they use short reads to get the final
sequence. For example, take some patients who are affected with cancer,
Parkinson's disease, Alzheimer's disease, and so on: how do they
differ from healthy individuals? They get the data for the patients,
get the full sequence, get the data from healthy
individuals, and compare them: what are the features, what are the
variations or mutations, and whether the mutations are in protein-coding
regions or non-coding regions. Then they relate how these mutations or
these residues are important, why they are involved in different pathways,
and how they influence different diseases. They try to understand this
information and then go on to the treatment. For example, if patients affected
with cancer or any specific disease are treated, they go to the
hospital, they get the patients' data as well as the information
regarding the drug and the drug response, and they make a database.
From that information you can see that for these specific variations
this specific drug will work. If we have this information, it will be
helpful for personalized medicine for different diseases.
Likewise, bioinformatics plays a major role in different aspects of
human health as well as medicine.
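
The comparison between patients and healthy individuals can be sketched very simply as a set comparison of variants. Everything in the example is an assumption made for illustration: the variant labels, the individuals, and the rule of requiring a variant in at least two patients are all made up, and a real analysis would start from sequencing reads and use proper statistics.

```python
from collections import Counter


def variant_counts(cohort):
    """Count in how many individuals each variant is observed."""
    counts = Counter()
    for variants in cohort.values():
        counts.update(set(variants))  # count each variant once per person
    return counts


def patient_specific(patients, healthy, min_patients=2):
    """Variants seen in at least `min_patients` patients and in no healthy individual."""
    in_patients = variant_counts(patients)
    in_healthy = set(variant_counts(healthy))
    return sorted(
        v for v, n in in_patients.items()
        if n >= min_patients and v not in in_healthy
    )


# Made-up data: individual -> list of variants (hypothetical gene:mutation labels).
PATIENTS = {
    "P1": ["G1:A10T", "G2:R5C"],
    "P2": ["G1:A10T"],
    "P3": ["G1:A10T", "G3:L7P"],
}
HEALTHY = {
    "H1": ["G2:R5C"],
    "H2": ["G3:L7P"],
}

if __name__ == "__main__":
    print(patient_specific(PATIENTS, HEALTHY))
```

Variants that survive such a comparison are only candidates; linking them to pathways, diseases, and drug response needs the further analysis described above.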