Protein data modelling for concurrent sequential patterns

  • Jing Lu
  • Malcolm Keech
  • Cuiqing Wang
  • Solent University
  • Shenyang Institute of Chemical Technology
Research Output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-review

Abstract

Protein sequences from the same family typically share common patterns which imply their structural function and biological relationship. The challenge of identifying protein motifs is often addressed through mining frequent itemsets and sequential patterns, where post-processing is a useful technique. Earlier work has shown that Concurrent Sequential Patterns mining can be applied in bioinformatics, e.g. to detect frequently occurring concurrent protein sub-sequences. This paper presents a companion approach to data modelling and visualisation, applying it to real-world protein datasets from the PROSITE and NCBI databases. The results show the potential for graph-based modelling in representing the integration of higher level patterns common to all or nearly all of the protein sequences.

Publication Information

Original language

English

Publication milestones

  • Published - 04/12/2014

Publisher

Institute of Electrical and Electronics Engineers Inc., United States

ISBN (Electronic)

9781479957224

External Publication IDs

  • handle.net: 10547/334492
  • Scopus: 84919389779

Host publication title

2014 25th International Workshop on Database and Expert Systems Applications
