Conformational folding status and folding levels based on global protein properties: a computational approach [electronic resource]
During folding, proteins undergo many events that continuously modify their structure, adjusting their structural elements and thereby altering their physical properties. These events are hard to observe experimentally, but computer simulations can generate valuable information about the conformational states through which proteins reach their native state. This information can be used to describe and understand the folding process; hence, researchers in protein folding use a single physical property, or a combination of them, to describe how a protein folds or what degree of folding it has. Their results differ, however, depending on the selected properties, the particular context, and the intended generality. We hypothesize that folding can be described by a few features that determine how a protein folds and that are associated with the stages of folding it passes through on the way from unfolded to folded, including intermediates. We refer to this set of folding features as the folding status, and we propose a computational method that combines a wide range of physical, structural, and energetic properties to obtain a concise representation of a conformation's folding features. Because the folding status is associated with the stages of folding, we also propose a mechanism for determining the most plausible conformational stages assumed by a protein conformation, which we refer to as folding levels. The objective of this thesis is therefore to derive a computational definition of the folding status of a protein and of the associated folding levels, based on physical properties of protein structures. We use protein folding pathways generated by the Probabilistic Roadmap Method, which provides a highly varied set of protein conformations. On these pathways, we apply Principal Component Analysis to define Inherent Conformational Features that describe a protein's folding status in a compact way.
The obtained features summarize the individual properties of a protein conformation and are associated with the general folding characteristics of stability, compactness, and native-likeness. The features can be used to compare the conformations of a pathway or trajectory, or to compare the folding status of conformations of different proteins. From the features, we derived the Inherent Conformational Feature Score, which condenses the three-dimensional feature status into a one-dimensional numeric value: the higher the score, the more folded the protein conformation. The features make it possible to deduce folding levels and to characterize their respective conformations computationally. By clustering our selected conformations, we obtain four well-defined groups that we associate with four main folding levels: unfolded, early intermediate, late intermediate, and folded. These folding levels agree with experimentally observed folding states of proteins. Moreover, they concisely reflect the dynamic behavior of a pathway when the folding process is represented in terms of its transitions between folding levels. Evaluating the selected properties on a large set of conformations is computationally very expensive. To make our methods available to researchers who do not have access to a sophisticated computing system, we developed a distributed framework that runs on personal computers and uses a cloud service for communication. We added a protein analysis toolkit to the framework that can execute all of the tasks described above.
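As a rough illustration of the pipeline summarized above, the following sketch reduces a matrix of per-conformation properties to three principal-component features and then groups the conformations into four clusters. It is not the thesis implementation: the property matrix is random placeholder data, the function names `pca_features` and `kmeans` are hypothetical, and k-means is assumed here only as a stand-in, since the abstract does not name the clustering algorithm. The Inherent Conformational Feature Score is not reproduced, because its formula is not given in the abstract.

```python
import numpy as np


def pca_features(properties, n_components=3):
    """Project standardized property vectors onto the top principal components."""
    X = np.asarray(properties, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant properties
    Z = (X - X.mean(axis=0)) / std
    # SVD of the standardized matrix: the rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return Z @ vt[:n_components].T, explained[:n_components]


def assign(points, centers):
    """Return the index of the nearest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k=4, n_iter=100, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the unspecified clustering)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centers)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(points, centers), centers


# Placeholder data: 200 conformations x 8 properties (random, NOT real measurements).
rng = np.random.default_rng(42)
properties = rng.normal(size=(200, 8))

features, explained = pca_features(properties, n_components=3)
labels, centers = kmeans(features, k=4)
```

With real pathway data, each row of `properties` would hold the physical, structural, and energetic values of one conformation, and the four resulting clusters would still have to be mapped to unfolded, early intermediate, late intermediate, and folded by inspecting their feature values, because cluster indices carry no inherent order.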