C1orf94

Introduction

Chromosome 1 open reading frame 94, commonly referred to as C1orf94, is a protein encoded by the C1orf94 gene located on the short arm of human chromosome 1. Although the gene and protein have been identified and characterized, the specific function of the C1orf94 protein remains poorly understood. This article covers its genetic structure, expression patterns, potential interactions, and implications in health and disease.

Gene Structure and Location

The C1orf94 gene is located at 1p34.3 on chromosome 1, spanning genomic coordinates chr1:34,166,883 to 34,219,131. It is encoded on the sense strand and comprises seven exons, six of which code for the protein. The gene is also known by several aliases, including Q6P1W5, B3KVT1, D3DPR3, E9PJ76, Q96IC8, and MGC15882. Notably, it shares an alias with the FLJ20508 gene.
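As a quick arithmetic check on the coordinates quoted above, the length of the locus can be computed from its start and end positions. The sketch below assumes the coordinates are 1-based and inclusive, a common genome-browser convention that the text does not state; the function and constant names are illustrative only.

```python
def span_length(start: int, end: int, inclusive: bool = True) -> int:
    """Return the number of bases covered by a genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # For inclusive coordinates both endpoints count, hence the +1.
    return end - start + (1 if inclusive else 0)


# Coordinates quoted in the text for chr1 (ASSUMED 1-based, inclusive).
C1ORF94_START = 34_166_883
C1ORF94_END = 34_219_131

locus_bp = span_length(C1ORF94_START, C1ORF94_END)
print(f"C1orf94 locus: {locus_bp:,} bp (about {locus_bp / 1000:.1f} kb)")
```

Under that assumption the locus covers 52,249 bp, or roughly 52 kb.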

mRNA Isoforms

C1orf94 produces two isoforms: isoform a and isoform b. Isoform a is the longer variant comprising 598 amino acids. The presence of multiple isoforms can suggest functional diversity within a single gene’s expression profile.

Transcriptional Regulation

C1orf94 is regulated by two predicted promoters; however, only one has been confirmed for use in analyses. Several transcription factors are predicted to bind specific sites in the promoter region. These include ZF02 (a member of the C2H2 zinc finger transcription factor family), Cart1 (a sequence-specific DNA-binding transcription factor), HTLV-I U5 repressive element-binding protein 1, NKX homeodomain factors, AARE binding factors, and the PREB core-binding element.

Protein Characteristics

The C1orf94 protein contains a domain known as DUF4688, which is conserved across eukaryotic species and present in both isoforms a and b. The protein has a molecular weight of approximately 65.35 kDa and an isoelectric point of 8.56. Proline is the most abundant amino acid (11.7%), followed closely by leucine (10.4%). The protein also contains seven PEST motifs, which are regions rich in proline (P), glutamic acid (E), serine (S), and threonine (T); their presence suggests that C1orf94 may have a short intracellular half-life.
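Composition figures such as those above are simply each residue's count divided by the sequence length. The following is a minimal sketch of that calculation; the toy peptide is hypothetical and is not the C1orf94 sequence, and the function name is an illustrative choice rather than part of any published analysis.

```python
from collections import Counter


def aa_composition(sequence: str) -> dict[str, float]:
    """Return each residue's share of the sequence in percent, most abundant first."""
    # Drop whitespace and line breaks, and normalise to upper case.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.most_common()}


# HYPOTHETICAL toy peptide for illustration only, not the real C1orf94 sequence.
toy_peptide = "MPPLSPELTPPLLSEW"

composition = aa_composition(toy_peptide)
for residue, percent in composition.items():
    print(f"{residue}: {percent:.1f}%")
```

Running the same function on the real isoform a sequence (for example, one downloaded from a protein database) would be expected to reproduce percentages of the kind quoted above.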

Post-Translational Modifications

C1orf94 undergoes several post-translational modifications including palmitoylation, phosphorylation, and glycation predominantly at its N-terminus. Additionally, there is a predicted mitochondrial processing peptidase cleavage site at the first methionine residue that may play a role in the protein’s maturation process.

Protein Structure

The secondary structure predictions for C1orf94 indicate the presence of alpha helices, extended strands, beta turns, and random coils. Tertiary structure modeling via Phyre2 and SWISS-MODEL suggests that C1orf94 exists as a monomeric protein. Structural analogs identified by I-TASSER include proteins such as 3IXZ (pig gastric H+/K+-ATPase complexed with aluminum fluoride) and 3B8E (the crystal structure of the sodium-potassium pump), which may provide insights into its functional mechanisms.

Protein-Protein Interactions

C1orf94 has been shown to physically interact with several proteins. Notably, it interacts with ATXN1, a chromatin-binding factor that represses Notch signaling when the Notch intracellular domain is absent. Affinity chromatography studies have demonstrated an interaction between C1orf94 and MMADHC, a mitochondrial protein involved in vitamin B12 metabolism. In addition, the STRING database suggests RFX2, a transcription factor with a critical role in spermatogenesis, as a potential functional partner.

Expression Patterns

According to AceView, C1orf94 is expressed at moderate levels relative to the average gene. NCBI data show its highest expression in testicular tissue, and the Human Protein Atlas reports slight expression in brain tissue. Expression profiles from GEO further suggest that C1orf94 levels increase significantly in association with morbid obesity and after coactivator depletion.

Functional Implications

The precise biological function of C1orf94 remains elusive; no definitive role for the protein has yet been validated. However, expression data indicate higher levels in normal tissues than in fetal developmental stages, which suggests that C1orf94 may be more important for adult cellular processes than for embryonic development.

Association with Diseases

C1orf94 has been implicated in oncogenic processes by genome-wide association studies (GWAS). It was classified as an OncoORF because of its involvement in protein-protein interactions relevant to cancer biology, most notably in colorectal cancer. Interactions involving A-kinase anchor protein 9 (AKAP9) have been highlighted for their potential role in promoting colorectal cancer development by regulating proteins such as Cdc42-interacting protein.

Evolutionary Perspective

Sequence homology analyses using NCBI BLASTp indicate that C1orf94 evolved faster than cytochrome c but more slowly than fibrinopeptides. No paralogs have been identified for this gene. Orthologs among mammalian species are well conserved, while more distant orthologs are found in fish species.
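Rate comparisons of this kind are usually made by converting the percent identity between orthologs into an estimated number of substitutions per site. The text does not say which correction was applied, so the sketch below uses the standard Poisson correction as an assumption; the identity values are hypothetical placeholders, not measured C1orf94 data.

```python
import math


def poisson_corrected_divergence(identity_fraction: float) -> float:
    """Estimate substitutions per site from fractional sequence identity.

    Applies the Poisson correction d = -ln(1 - p), where p = 1 - identity
    is the observed fraction of differing sites.
    """
    if not 0.0 < identity_fraction <= 1.0:
        raise ValueError("identity must be in the interval (0, 1]")
    p = 1.0 - identity_fraction
    return -math.log(1.0 - p)


# HYPOTHETICAL identity values for illustration only (not real BLASTp results).
example_identities = {"close ortholog": 0.90, "distant ortholog": 0.40}

for label, identity in example_identities.items():
    d = poisson_corrected_divergence(identity)
    print(f"{label}: identity {identity:.0%}, corrected divergence {d:.2f}")
```

Plotting such corrected divergences against estimated divergence times for C1orf94, cytochrome c, and fibrinopeptides gives the slopes that a comparison like the one above is based on.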

Amino Acid Composition Consistency

An analysis of orthologs from various species, such as gorillas, rats, dogs, and bats, reveals only minor variations in amino acid composition compared to humans. Proline remains the most abundant amino acid across these sequences, followed by leucine, while tryptophan consistently appears as one of the least abundant residues.

Conclusion

C1orf94 remains an intriguing subject within human genetics: its function is unknown even though it is well characterized at both the genetic and proteomic levels. As research continues to explore its interactions and expression profiles, a clearer picture may emerge of its biological significance and its potential implications for conditions such as cancer and metabolic disorders. Further investigation will be needed to establish the role of this gene and its encoded protein.


Article based on: Wikipedia (EN).