J. Biol. Chem., Vol. 281, Issue 33, 23676-23685, August 18, 2006
Characterization of Human Mucin MUC17
COMPLETE CODING SEQUENCE AND ORGANIZATION*
From the Department of Biochemistry and Molecular Biology, University of Nebraska Medical Center, Omaha, Nebraska 68198
Received for publication, January 11, 2006, and in revised form, May 8, 2006.
With increasing interest in mucins as diagnostic and therapeutic targets in cancers and other diseases, it is becoming imperative to characterize novel mucins and investigate their biological significance. Here, we present the completed coding sequence and genomic organization of MUC17, extending the previously published partial cDNA sequence. Rapid amplification of cDNA ends with PCR, sequences from the Human Genome databases, and in vitro transcription/translation assays were used for these analyses. The MUC17 gene is located within a 39-kb DNA fragment between MUC12 and SERPINE1 on chromosome 7 in the region q22.1. The full-length MUC17 gene is transcribed into a 14.2-kb mRNA encompassing 13 exons. Alternative splicing generates two variants coding for a membrane-anchored and a secreted form. The canonical variable number of tandem repeats polymorphism of the central tandem repeat domain of the MUC genes is not significantly detected in the MUC17 gene. In addition, we show the overexpression of MUC17 by Western blot and immunohistochemical analyses in pancreatic tumor cell lines and tumor tissues compared with the normal pancreas. The expression of MUC17 is regulated by a 1,146-bp fragment upstream of MUC17 that contains VDR/RXR, GATA, NF-κB, and Cdx-2 response elements.
Mucins, the main components of mucus, are high molecular weight O-glycoproteins expressed and secreted by epithelial cells and, in some cases, by endothelial cells (1, 2). Their principal function is to protect and lubricate the epithelial surfaces (1); however, recent reports also demonstrate that mucins and, more specifically, membrane-bound mucins have a role in signal transduction and oncogenic processes (1, 3, 4). Presently, 20 mucin (MUC)3 genes have been identified and are named MUC1–2, MUC3A/B, MUC4, MUC5AC, MUC5B, MUC6–13, and MUC15–20 (2, 5–8). These mucins have been grouped into two subfamilies, the secreted and the membrane-bound. The secreted mucins are exclusively expressed by specialized epithelial cells and exhibit a restricted pattern of expression within the human body (1, 2). The membrane-bound mucins are expressed at the apical region of epithelial cells under normal conditions and have a wide expression (1, 2). Moreover, alternative splicing and proteolytic cleavage can lead to the generation of three distinct forms of the transmembrane mucins: soluble (proteolytic cleavage of the membrane-bound form), secreted (alternatively spliced variants), and one lacking the tandem repeat domain (alternatively spliced variants) (9–12). The ratio of one form to another shows tissue specificity and is associated with the physiologic condition (13, 14).
MUC17, a membrane-bound mucin, was recently identified and located in the mucin cluster at the chromosomal locus 7q22, along with the MUC3A/B, MUC11, and MUC12 mucins (5, 15, 16). The first partial cDNA sequence, now known to correspond to MUC17, was identified by Van Klinken et al. (17), who reported five tandem repeats, each encoding 59 amino acid residues, located upstream of the 17-amino acid residue tandem repeats of MUC3. Both sequences, repeated in tandem, were identified on the same cDNA fragment. However, after the characterization of the full-length sequences of MUC3A and MUC3B, the clone isolated by Van Klinken et al. appeared to be a chimeric cDNA fragment, composed of an unknown gene sequence fused to the MUC3 tandem repeat sequence. In 2002, driven by the hypothesis that the five 59-amino acid residue tandem repeat sequences were part of a new, unidentified mucin, Gum et al. (5) screened the public GenBank™ data base and the proprietary Lifeseq Gold data base (Incyte Genomics, Inc., Palo Alto, CA) and identified the sequence downstream of the 59-amino acid tandem repeat. The authors reported a partial cDNA fragment of 3,803 bp (accession number AF430017) composed of five repetitions of a 177-bp motif upstream of a non-repetitive sequence. The deduced amino acid sequence presented characteristics of a membrane-bound mucin, with five repetitions of the 59-amino acid residue motif followed by two EGF-like domains, a SEA domain, a hydrophobic transmembrane domain, and an 80-amino acid cytoplasmic tail. The new mucin gene, called MUC17, was localized to chromosome locus 7q22 along with MUC3A/B and MUC11/12. Herein, we report the complete characterization of the MUC17 gene and its transcripts along with the deduced structural organization of the protein. Our study shows that MUC17 is expressed in at least two alternatively spliced forms encoding membrane-bound and secreted forms.
Moreover, inter-individual VNTR polymorphism is also observed, giving rise to three allelic forms. Furthermore, we report that the intergenic region (1146 bp) between MUC12 and MUC17 possesses both basic and enhancer regulatory elements and may be responsible for cell-specific regulation of MUC17. A differential expression profile of MUC17 was observed in pancreatic tumors compared with the normal pancreas.
Tissue Specimens and Cell Lines
A total of 24 established cancer cell lines (pancreas, colon, and breast) were used as sources of genomic DNA. Additionally, four genomic DNA samples were extracted from peripheral blood mononuclear cells of healthy individuals to validate the results obtained using the cancer cell lines. Samples were collected under a protocol approved by the Institutional Review Board at the University of Nebraska Medical Center, Omaha, NE. Informed consent was obtained from all subjects.
5'-Rapid Amplification of cDNA Ends PCR
The 5'-RACE kit (Roche Applied Science) was used to synthesize first-strand cDNA from total AsPC-1 cell line RNA (2 µg) with a specific MUC17 primer (RACE 171, GTGATAGCCTCTGAACTGGCC). Terminal transferase was used to add a poly(dA) tail to the 5'-end of the cDNA. RACE-PCR experiments were performed in 50-µl reaction volumes containing 5 µl of 10x buffer (100 mM Tris/HCl/15 mM MgCl2/500 mM KCl, pH 8.3), 5 µl of 10 mM deoxynucleoside triphosphates, 5 µl of poly(dA)-tailed cDNA, 0.2 µM of each primer (MUC17-specific RACE 172, CATGGTGCTGGCAGGCATACT, and the oligo(dT)-anchor primer provided by the RACE kit supplier), and 2 units of Taq DNA polymerase (MBI Fermentas, Hanover, MD). The mixture was denatured at 94 °C for 2 min, followed by 30 cycles at 94 °C for 30 s, 60 °C for 1 min, and 72 °C for 2 min. The final elongation step was a 15-min extension. A 1-µl aliquot of the amplification product was further amplified by a second PCR reaction using a MUC17-specific nested primer (RACE 173, GTAGGAGATGAACTTGCCTGA) and the PCR anchor primer (provided by the supplier). PCR products were electrophoretically resolved on 1% agarose gels and stained with ethidium bromide. Photographs were taken under UV light, using the GelExpert software (Nucleotech, San Carlos, CA). Amplification products were excised and purified with the QIAquick® gel extraction kit (Qiagen, Valencia, CA), cloned into the pCR®2.1 vector (Invitrogen), and sequenced.
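As a cross-check of the RACE-PCR recipe, the final concentrations in the 50-µl reaction follow from the usual dilution rule (C1·V1 = C2·V2). The sketch below is illustrative only: the function and variable names are hypothetical, and the stock concentrations and volumes are those quoted in the text.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after diluting `stock_volume_ul` of stock into
    `total_volume_ul` (C1 * V1 = C2 * V2). Units follow `stock_conc`."""
    return stock_conc * stock_volume_ul / total_volume_ul


TOTAL_UL = 50.0  # reaction volume stated in the text

# Composition of the 10x buffer as quoted in the text (mM).
buffer_10x_mM = {"Tris/HCl": 100.0, "MgCl2": 15.0, "KCl": 500.0}

# 5 µl of 10x buffer in a 50-µl reaction gives a 1x buffer.
final_buffer_mM = {
    name: final_concentration(conc, 5.0, TOTAL_UL)
    for name, conc in buffer_10x_mM.items()
}

# 5 µl of a 10 mM dNTP stock in 50 µl.
final_dntp_mM = final_concentration(10.0, 5.0, TOTAL_UL)

for name, conc in final_buffer_mM.items():
    print(f"{name}: {conc:g} mM")
print(f"dNTPs: {final_dntp_mM:g} mM")
```

Under these figures the working reaction contains 10 mM Tris/HCl, 1.5 mM MgCl2, 50 mM KCl, and 1 mM deoxynucleoside triphosphates.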
Expand Long Template PCR
To identify potential MUC17 splice variants in the 3'-extremity, an RT-PCR reaction was performed using the Expand Long Template PCR system (Roche Applied Science) with sense (5'-CTGTGCCAAGAACCACAACAT-3') and antisense (5'-CTCCTCACTCCCAGACTTCTC-3') primers. Expand Long Template PCR was performed in 50-µl reaction volumes containing 5 µl of AsPC-1 cDNA, 5 µl of 10x buffer 3, 2.5 µl of 40 mM deoxynucleoside triphosphates, 0.2 µM of each primer, 0.75 mM MgCl2, and 2.5 units of polymerase mixture (Roche Applied Science). The reaction mixture was denatured at 94 °C for 2 min, followed by 30 cycles at 94 °C for 30 s, 60 °C for 1 min, and 68 °C for 4 min, with the elongation time of the last 20 cycles extended by 40 s for each cycle. The final elongation step was extended for an additional 30-min period. The amplification product was directly cloned into the pCR®2.1 vector (Invitrogen), amplified, and sequenced.
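The elongation schedule of the Expand Long Template program can be written out cycle by cycle. The sketch below assumes that "extended 40 s for each cycle" means a cumulative increase, so that each of the last 20 cycles is 40 s longer than the one before it. That reading is an assumption, and the function name and parameters are hypothetical.

```python
def elongation_schedule(cycles=30, base_s=240, extended_cycles=20, increment_s=40):
    """Return the elongation time in seconds for each PCR cycle.

    Assumption (not confirmed by the text): during the last
    `extended_cycles` cycles, the elongation step grows cumulatively
    by `increment_s` seconds per cycle.
    """
    first_extended = cycles - extended_cycles  # 0-based index of first extended cycle
    times = []
    for i in range(cycles):
        extra_steps = max(0, i - first_extended + 1)
        times.append(base_s + extra_steps * increment_s)
    return times


schedule = elongation_schedule()
total_elongation_min = sum(schedule) / 60

print("cycle 1:", schedule[0], "s")
print("cycle 11:", schedule[10], "s")
print("cycle 30:", schedule[-1], "s")
print("total elongation:", total_elongation_min, "min")
```

Under this reading the elongation step grows from 4 min in the first ten cycles to over 17 min in the last cycle, which is the kind of schedule used when long, repetitive templates are amplified.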
In Vitro Transcription and Translation Assays
An amplification product was generated using forward primer 5'-GCCAGCTCCTCTGGGGTGAC-3' and reverse primer RACE 171 (described previously). The product was cloned in pCR®2.1 under control of the T7 promoter. The DNA contained a coding region for a peptide with a predicted size of 36 kDa preceded by a putative Kozak sequence, followed by an ATG as well as a 25-residue N-terminal signal sequence. Transcription and translation experiments were performed with the TNT® Quick Coupled Transcription/Translation System (Promega, Madison, WI) according to the manufacturer's instructions. The amino acid mixture containing [35S]methionine (1000 Ci/mmol) was used for in vitro translation, and the product was analyzed by SDS-PAGE. Negative controls consisted of a MUC17 sequence cloned in the opposite direction and an empty vector. The
Southern Blot Analysis
Genomic DNA, isolated from the 24 human tumor cell lines and from peripheral blood mononuclear cells of four healthy individuals, was digested with EcoRI and HindIII restriction endonucleases. Digested products were resolved by electrophoresis in 0.8% agarose gels and transferred to nylon membranes. Membranes were hybridized with a MUC17 tandem repeat probe. The probe (3 kb) was generated by PCR amplification using the MUC17-TR forward (GATATGAGCACACCTCTGACC) and MUC17-TR reverse (ATGTTGTGGTTCTTGGCACAG) primers, cloned in pCR®2.1, and sequenced before use. The probe was radiolabeled using the Random Primers DNA Labeling System (Invitrogen) and
Assay of Transcriptional and Enhancer Activities
Transient transfections of MUC17 promoter/enhancer luciferase reporter constructs were performed using the pGL3-basic and -enhancer vectors (Promega). Five constructs spanning exon 11 of MUC12 to the MUC17 5'-untranslated region were prepared using AsPC-1 DNA. For transfections, cells were seeded into 6-well plates in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum. Transfections were carried out using Lipofectamine (Invitrogen) according to the manufacturer's instructions. Transfection conditions were optimized for each cell line. Cells were co-transfected with pSV-
Identification and Characterization of the Central Repetitive Domain of MUC17
To extend the MUC17 sequence toward its 5'-extremity, the known repeated 177-bp motif (characteristic of the tandem repeat array of MUC17) was positioned on the sequence corresponding to the chromosomal locus 7q22 using the map viewer interface of the human genome resources data base (National Center for Biotechnology Information, Bethesda, MD). The MUC17 sequence was localized to the BAC clone RP11-395B7 (accession number AC105446). Altogether, the RP11-395B7 clone contained 60 repetitions of the 177-bp motif, directly downstream of 600 bp of degenerate repetitive sequence.
The characteristics of the MUC17 central domain, i.e. its length and VNTR polymorphism, were investigated by Expand Long RT-PCR and Southern blot analysis. Expand Long PCR, which allows amplification of large DNA fragments (up to 30 kb), was performed on AsPC-1 cDNA using sense and antisense primers that recognize both extremities of the MUC17 tandem repeat domain (Fig. 1A). The amplification products were resolved on a 0.8% agarose gel. Numerous amplification products ranging from 1.5 to 8 kb were detected (Fig. 1B), as expected given the repetitive nature of the amplified sequence. The largest amplification product detected, with a molecular size of 8 kb, should represent 45 repetitions of the 177-bp motif. However, no amplification product was detected at 10.8 kb, the expected size for the full-length tandem repeat domain. Analysis of the BAC RP11-395B7 sequence suggested the presence of a HindIII site 5434 bp upstream of the tandem repeat array and an EcoRI site 1128 bp downstream of the repetitive sequence, encompassing a fragment of 17.4 kb (Fig. 1A). Genomic DNA, purified from 24 cancer cell lines (pancreas, colon, and breast) and from four healthy individuals, was digested with HindIII and EcoRI endonucleases and probed with a MUC17 tandem repeat-specific probe. The 3-kb amplification product shown in Fig. 1B was cloned in the pCR®2.1 vector, sequenced, and used as a probe in Southern blot analysis of the 28 genomic DNA samples. The probe contained 17 repetitions of the 177-bp motif. A low degree of VNTR polymorphism was observed, with only three bands detected, each with an approximate size of 17 kb (Fig. 1C). The size of the bands detected was consistent with the size of the tandem repeat array observed in the BAC clone (RP11-395B7) sequence, which contained 60 repetitions of the 177-bp motif. These three bands were considered to result from various allelic forms of MUC17 (alleles A, B, and C), exhibiting VNTR polymorphism.
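The repeat counts quoted for the amplification products follow from the 177-bp motif length, which itself corresponds to 59 codons. A minimal sketch of that arithmetic is given below; the helper names are hypothetical and the sizes are those reported in the text.

```python
REPEAT_BP = 177   # length of one tandem repeat motif (from the text)
REPEAT_AA = 59    # amino acid residues encoded by one motif (from the text)


def repeats_to_bp(n_repeats):
    """Length in bp of an array of `n_repeats` perfect 177-bp repeats."""
    return n_repeats * REPEAT_BP


def bp_to_repeats(size_bp):
    """Nearest whole number of 177-bp repeats in a fragment of `size_bp`."""
    return round(size_bp / REPEAT_BP)


# One motif encodes 59 residues: 59 codons x 3 nt = 177 bp.
assert REPEAT_AA * 3 == REPEAT_BP

print("8-kb product  ->", bp_to_repeats(8000), "repeats")
print("3-kb probe    ->", bp_to_repeats(3000), "repeats")
print("60 repeats    ->", repeats_to_bp(60), "bp")
```

Sixty perfect repeats span 10,620 bp, slightly less than the 10.8 kb quoted for the full-length tandem repeat domain. The difference presumably reflects flanking sequence between the primers and the repeat array, although the text does not say so explicitly.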
Of the 24 cancer cell lines and 4 control DNA samples, 22 were homozygous for MUC17. Two cell lines had the allelic form A (higher molecular weight), whereas the remaining 20 cell lines had the allelic form B (intermediate molecular weight). None of the cell lines was homozygous for the allelic form C. The remaining six samples were heterozygous for MUC17, and allelic form B was present in all of them: four were heterozygous for the allelic forms A and B, whereas the other two were heterozygous for the allelic forms B and C (lower molecular weight). The frequency of the allelic forms was 14.3% for allele A, 82.1% for allele B, and 3.6% for allele C.
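The reported allele frequencies can be recovered by counting alleles across the 28 diploid samples (56 alleles in total). The sketch below uses the genotype counts given in the text, reading the 22 homozygous samples as 2 A/A and 20 B/B; the variable and function names are hypothetical.

```python
from collections import Counter

# Genotype counts as described in the text (28 samples in total).
genotype_counts = {
    ("A", "A"): 2,   # homozygous, allelic form A
    ("B", "B"): 20,  # homozygous, allelic form B
    ("A", "B"): 4,   # heterozygous A/B
    ("B", "C"): 2,   # heterozygous B/C
}


def allele_frequencies(genotypes):
    """Return allele frequencies (percent) from a genotype -> count mapping."""
    alleles = Counter()
    for genotype, n_samples in genotypes.items():
        for allele in genotype:
            alleles[allele] += n_samples
    total = sum(alleles.values())
    return {allele: 100.0 * count / total for allele, count in alleles.items()}


frequencies = allele_frequencies(genotype_counts)
rounded = {allele: round(freq, 1) for allele, freq in sorted(frequencies.items())}
print(rounded)
```

With 8, 46, and 2 copies of alleles A, B, and C, respectively, this gives 14.3%, 82.1%, and 3.6%, matching the frequencies stated above.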
The average size of each allelic form (
Identification and Characterization of the 5'-Extremity and Genomic Organization of MUC17
Combining the information deduced from the BAC clone RP11-395B7 sequence and the partial cDNA sequence of MUC17 (AF430017), the MUC17 gene was precisely located on chromosome 7 in the region q22.1, oriented from centromere to telomere, between the MUC12 and the SERPINE1 (serine proteinase inhibitor) genes (Fig. 2A). The 5'-RACE-PCR was performed on total RNA from AsPC-1, a pancreatic adenocarcinoma cell line that expresses a high level of MUC17, using three antisense primers localized in the degenerate sequence upstream of the tandem repeat array (sequences given under "Materials and Methods"). Several amplification products were detected, with sizes varying from 100 to 1000 bp for the first PCR and from 200 to 700 bp after nested PCR (Fig. 3A). The detection of several amplification products during the 5'-RACE-PCR was expected for two main reasons. First, the MUC17-specific primers were localized in the degenerate sequence upstream of the tandem repeat array. Second, the size of the amplification products is directly dependent on the reverse transcription efficiency, which gives rise to a multitude of cDNAs (partial or full copies of the mRNAs). Of these fragments, the largest cloned cDNA fragment (653 bp) was sequenced. Its 3'-end overlapped the 5'-extremity of the degenerate repetition located upstream of the tandem repeat array. Comparison of the 5'-end of the RACE-PCR product with the sequence of the BAC RP11-395B7 clone led to the identification of two new exons. The compiled nucleotide sequences of the RACE-PCR clone, the 177-bp tandem repeat of the BAC RP11-395B7, and the sequence identified and characterized by Gum et al. (AF430017) (5) allowed us to establish the complete organization of MUC17 (Fig. 2, B and C). This sequence has been deposited in the EMBL (European Molecular Biology Laboratory, Heidelberg, Germany) data base (AJ606307).
Altogether, the MUC17 gene (39 kb) encodes an mRNA of up to 14,221 bp in size (due to the VNTR polymorphism) after splicing of 13 exons ranging in size from 61 to 12,185 bp (Table 1). The size of the introns ranges from 121 to 10,902 bp. The largest exon, E3, encodes the central domain and is composed of 60 repetitions of a 177-bp tandemly repeated motif. This exon codes for the main O-glycosylated domain of MUC17. Exon 1 (E1) of MUC17 is located 1,146 bp downstream of the 3'-extremity of the last MUC12 exon. The position of MUC17 E1 was further confirmed by PCR amplification on AsPC-1 genomic DNA using a forward primer located in the last MUC12 exon and a reverse primer located in the first MUC17 exon (data not shown). A similar amplification performed on AsPC-1 RNA (cDNA) did not yield any amplification product, showing that the MUC12 and MUC17 genes are transcriptionally independent. Exon 1 of MUC17 contains the 5'-untranslated region and the sequence coding for the MUC17 signal peptide.
Full-length Coding Sequence of MUC17
The translation initiation codon is located at position 54 in the sequence (AJ606307) and is preceded by a non-consensus Kozak (18) sequence (AGAGCTCCGATG). A Kyte-Doolittle (19) hydropathy plot of the N-terminal extremity of MUC17 shows that the initial 25 residues encoded by exon 1 are very hydrophobic. Analysis using SignalP V1.1 software from the Center for Biological Sequence Analysis (Technical University of Denmark) predicted the presence of a signal peptide of 25 amino acids with a cleavage site located between positions 25 and 26 (AAAEQ). Fig. 2C shows a schematic representation of the MUC17 deduced amino acid sequence.
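The non-consensus character of the reported initiation context can be illustrated by inspecting the two positions that matter most in the Kozak consensus: a purine (A or G) at −3 and a G at +4, with the A of the ATG numbered +1. The sketch below is a simplified check and not a full Kozak scoring scheme; the function name and return format are hypothetical, and only the 12 nucleotides quoted above are used, so the +4 position lies outside the available sequence.

```python
PURINES = {"A", "G"}


def kozak_context(seq, atg_index):
    """Report the -3 and +4 bases around an ATG at `atg_index` (0-based).

    Simplified check: the consensus is taken as a purine at -3 and G at +4.
    Positions outside the supplied sequence are returned as None.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else None
    plus4_index = atg_index + 3
    plus4 = seq[plus4_index] if plus4_index < len(seq) else None
    return {
        "minus3": minus3,
        "plus4": plus4,
        "minus3_consensus": minus3 in PURINES,
        "plus4_consensus": plus4 == "G",
    }


reported = "AGAGCTCCGATG"  # sequence quoted in the text; the ATG is its last codon
context = kozak_context(reported, len(reported) - 3)
print(context)
```

For the quoted sequence the −3 base is C, which is not a purine and therefore departs from the consensus.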
To confirm the functionality of the potential translation initiation site, the region upstream of the tandem repeat of MUC17 was amplified by PCR and subcloned in sense orientation downstream of the T7 promoter. The resulting construct was used for an in vitro transcription/translation assay (Fig. 3B). The expected 36-kDa protein corresponding to the MUC17 N-terminal region was detected. No protein product was detected for either negative control. As a positive control, the N-terminal extremity of the
Alternative Splicing of the MUC17 Transcript
The presence of one or more alternative splice events in the 3'-extremity of MUC17 was investigated by RT-PCR. For this purpose, a forward primer was chosen in exon 3 (tandem repeat domain) and a reverse primer in the 3'-untranslated region (position and sequence given under "Materials and Methods"). Using these primers, RT-PCR was carried out on AsPC-1 cDNA. The amplification products generated were cloned into pCR®2.1 and sequenced. Two distinct fragments were identified through sequencing (Fig. 3C). One of the fragments was 100% identical to the MUC17 sequence referred to above (accession number AJ606307). The second product revealed the occurrence of an alternative splice event that resulted from the skipping of exon 7. This alternative splice event generated a frameshift coding for 21 MUC17/SEC-specific amino acid residues and introduced a stop codon positioned 66 nucleotides after the alternative splice site junction. The resulting protein is the secreted form of MUC17 (accession number AJ606308, MUC17/SEC), lacking the second EGF domain, the transmembrane domain, and the cytoplasmic tail. Several sets of primers were assayed along the 3'-extremity of MUC17, and RT-PCR was carried out in four distinct cell lines (pancreatic AsPC-1 and colonic LS 174T, CaCo-2, and LS 180) (Fig. 3C). Two amplification products were detected.
Sequencing of the major amplification product identified it as the MUC17 sequence described by Gum et al. (5) with the accession number AF430017, whereas the other (minor) amplicon corresponded to an alternatively spliced (skipping of exon 7) secreted form (AJ606308) of MUC17 (MUC17/SEC). The expression of MUC17/SEC was low in the cells investigated as compared with the major MUC17 membrane-bound form. The MUC17/SEC sequence was submitted to the EMBL data base (AJ606308) and also appears in the GenBank™ data base.
Expression Analyses of MUC17 in Normal versus Diseased Pancreas
MUC17 expression was investigated by RT-PCR in a panel of 2 normal pancreata, 8 pancreatitis, and 11 pancreatic adenocarcinoma tissue samples. MUC17 was expressed in 81% of the tumor samples, whereas it was not detectable in either the normal pancreata or the pancreatitis samples (Fig. 5A).
To confirm that MUC17 is detectable at the protein level, the 60 units of repetition composing the central domain of MUC17 were aligned to establish a consensus motif (Fig. 5B). A Hopp and Woods (21) hydrophilicity plot of this consensus sequence was carried out to delineate the antigenic region within these 59 amino acid residues. The peptide showing the highest level of antigenicity, Pro-Thr-Thr-Ala-Glu-Gly-Thr-Ser-Met-Pro-Thr-Ser-Thr-Pro-Ser-Glu, was synthesized. An additional cysteine was added to the C terminus to boost antigenicity. The peptide was conjugated to keyhole limpet hemocyanin, which served as a carrier for immunization of rabbits. Serum from the immunized rabbits showed a high antibody titer and good specificity by enzyme-linked immunosorbent assay, Western blot, and immunohistochemistry. Protein lysates from LS 174T and AsPC-1 cells served as positive controls (5), whereas lysate from PANC-1 cells served as a negative control. Immunoblot analysis revealed the presence of a single intense band for both AsPC-1 and LS 174T protein lysates, with an apparent molecular mass consistent with a protein of
Characterization of the MUC17 Gene Regulatory Sequence
Having shown that MUC17 is abnormally expressed in pancreatic adenocarcinoma, we further investigated the DNA sequence of the MUC17 5'-flanking region to identify the regulatory sequence. No consensus promoter sequence was identified within this region by computer-based analysis, using both the PromoterInspector browser from Genomatix-Suite 3.4.1 software (Genomatix, Munich, Germany) and the PROSCAN Version 1.7 program from the Center for Information Technology, National Institutes of Health. The 1146-bp DNA fragment located between MUC12 and MUC17 was cloned in the pGL3-basic and -enhancer vectors. The fragment was generated by PCR using a forward primer that overlapped the last 10 nucleotides of the MUC12 gene and a reverse primer overlapping the first 11 nucleotides of the MUC17 gene.
The corresponding amplification product was 1167 bp long. Four additional constructs were made that comprised the MUC12 last intron (intron 11, 942 bp), the MUC12 last exon (exon 12, 360 bp), MUC12 intron 11-exon 12 (1282 bp), and a fragment extending from MUC12 intron 11 to the MUC17 5'-untranslated region (2429 bp). As a control, the −775/+57 DNA fragment of the MUC3 promoter reported by Gum et al. was cloned into the pGL3-basic and -enhancer vectors and was used in the assays for transcriptional activity (Fig. 6) (19). A 19-fold activation was detected for the 1167-bp intergenic region cloned in the basic vector in AsPC-1 cells, whereas 1.1- and 1.3-fold activations were detected for HPAF and PANC-1 cells, respectively. Hence, it can be inferred that the intergenic region possesses basic promoter activity, which seems to be cell-specific. Interestingly, the intergenic region showed very strong enhancer activity, with 300- and 110-fold activations measured for the AsPC-1 and HPAF cells, respectively. However, no enhancing activity was detected in the PANC-1 cells for this region. The 2429-bp full-length fragment cloned in the basic vector also presented 15-, 1.76-, and 1.12-fold basic promoter activity in the AsPC-1, HPAF, and PANC-1 cells, respectively. These results confirmed the presence of a promoter within the intergenic fragment. The results obtained using the MUC3 fragment were consistent with those described by Gum et al. (22), with both basic and enhancing activity detected.
Over the years, interest has increased in the study of the different mucin family members. When first investigated, mucins were found to be associated with respiratory obstruction linked to the common cold or the flu. Now it is clear that the overexpression of mucins (23, 24) as well as the modification of the rheologic properties of the mucus (25) are also responsible for respiratory obstruction in patients with cystic fibrosis (26, 27). Currently, mucins and their implications in numerous disorders are being widely investigated for the development of early diagnostics (28) and/or therapeutics such as vaccines (29–32). For instance, the role of mucin members during malignant development and progression is starting to be well documented (3, 4). Mucins, specifically membrane-bound mucins, are thought to act through tyrosine kinase receptors to promote proliferation, with MUC1 acting with the EGF receptor (33) and MUC4 through HER2 (34). CA125, the marker used to diagnose ovarian cancer, is the membrane-tethered mucin MUC16 (8). A full understanding of mucin function in the development and progression of these diseases, and the use of mucins for diagnostic and prognostic purposes, requires the identification and characterization of the full-length mucin sequences.
One of the membrane-anchored mucins, MUC17, was identified by Gum et al. in 2002 (5). In the present study, the previously known MUC17 sequence was extended toward its 5'-extremity to complete the sequence and localize the promoter and regulatory elements. MUC17 presents the classic architecture of the membrane-bound family members. MUC17 encompasses 13 exons. Its N-terminal domain is encoded by 2 exons and possesses a leader sequence followed by a short unique sequence of 34 amino acid residues. Surprisingly, this sequence is not rich in serine, threonine, and proline but contains three cysteine residues. A BLAST search of this sequence using the Prosite recognition domains failed to identify any known functional domain within these 34 residues. The MUC17 translation initiation codon is surrounded by a non-consensus Kozak sequence, with a C for an A/G in the −3 position (18, 35) and a C for a G in the +4 position. This type of non-consensus Kozak sequence is found in <1% of all known mammalian genes and may be 10-fold less efficient in initiating translation than the consensus Kozak sequence (36). Other mucins, such as MUC3A and MUC3B, clustered with MUC17 on chromosome 7q22, also have similar non-consensus Kozak sequences (22). MUC17 contains a large domain composed of at least 60 repetitions of a 59-amino acid residue motif in its central region, which is followed by a sequence of degenerate repeats. The C-terminal sequence is composed, as described by Gum et al. (5), of two EGF-like domains, a SEA (sea-urchin sperm protein, enterokinase, and agrin) domain (37, 38), a transmembrane sequence, and an 81-amino acid residue cytoplasmic tail. A comparison of the MUC17 N-terminal amino acid sequence with its MUC3 and mMuc3 counterparts reveals a high degree of identity between MUC17 and mMuc3 (over 55%). However, no homology was detected between MUC17 and the recently identified MUC3 amino acid sequence (22).
At the gene level, MUC3 possesses a single exon at the 5'-extremity, whereas MUC17 and mMuc3 have two exons. Gum et al. (5) reported that the C-terminal sequence of MUC17 is more similar to rat Muc3 and mouse Muc3 than to any other known human protein. Because of the high degree of similarity between the proteins and their identical genomic organization, we suggest that rodent Muc3 should be referred to as rat Muc17 and mouse Muc17. For clarity, we will use the name Muc17 in the remainder of the text.
Additionally, we examined the expression profile of MUC17 (Fig. 5) by utilizing polyclonal antibodies generated against the tandem repeat domain. The advantage of choosing a sequence within the repetitive domain is that the antibodies will recognize multiple epitopes on the MUC17 protein, resulting in enhanced sensitivity. Conversely, the limitation of such a choice is that the epitope(s) can be masked in vivo by glycosylation. However, our previous experience with an antibody generated against the tandem repeat region of MUC4, which exhibits high sensitivity (39), encouraged us to use the MUC17 tandem repeat peptide as an immunogen. Moreover, several studies have shown that, in cancer conditions, MUC1 and MUC4 are present in underglycosylated forms at the cell surface (40, 41). Furthermore, the anti-tandem repeat MUC antibodies have been shown to recognize mucins in normal physiologic conditions as well (39). It is strongly suggested that rodent Muc17, like its human counterpart, goes through a recycling step to be fully glycosylated. The existence of membrane-bound mucins as two glycoforms (hypo- and fully glycosylated) allows us to use a sequence within the tandem repeat array to design a mucin-specific antibody. Even if the antibodies cannot recognize the mature, fully glycosylated mucins, they should react against the hypoglycosylated forms.
In 2001, Khatri et al. (42) showed that rodent Muc17 (or Muc3) was expressed in two forms, a membrane-associated and a soluble form. They showed that the soluble form was produced after proteolytic cleavage of rodent Muc17 between the second EGF and the transmembrane domains (42). Furthermore, they reported that the columnar cells predominantly expressed the membrane-bound form, whereas the goblet cells uniquely expressed the soluble form of Muc17. The differential pattern of expression of transmembrane and soluble forms of Muc17 was proposed to result from alternative splicing of the Muc17 transcript. Our data are consistent with this hypothesis, revealing a transcript for the soluble form of MUC17 (MUC17/SEC); however, MUC17/SEC appears to be expressed at a very low level in the cells and needs further examination.
Polycistronic mRNAs are very rare in eukaryotic cells but have been identified, for instance, in humans (43), mice (43), and Drosophila (44). In humans and mice, both UOG-1 and GDF-1 are clustered on the same chromosomal locus, separated by 269 bp (human) and 404 bp (mice) (43). Interestingly, the first exon of MUC17 was located 1146 bp from the last exon of MUC12. Because the gap between MUC12 and MUC17 was only 1146 bp, PCR was conducted on genomic DNA and on cDNA derived from the total RNA of the MUC3-, MUC12-, and MUC17-positive cell line AsPC-1. The expected amplification product was detected in reactions that used genomic DNA; however, no amplification product was detected with cDNAs as templates. This observation ruled out the extreme possibility that MUC12 and MUC17 might be transcribed as a polycistron.
Having shown that both MUC12 and MUC17 genes are transcribed as independent mRNAs, we searched for a consensus promoter within the 1146-bp intergenic region. No consensus promoter region was identified, and the only sequence recognized as a potential promoter for MUC17 was localized within the last exon of MUC12. The 1146-bp intergenic fragment presented both basic promoter and enhancer activities, whereas the MUC12 intron 11 did not. As depicted in Fig. 6, the MUC17 fragment possessing promoter activity harbors numerous transcription factor binding elements, including GATA, VDR/RXR, Cdx-2, NF
MUC12 was first identified by Williams et al. (15) by differential display as down-regulated in colorectal cancer. MUC12 is strongly expressed in the normal colon but down-regulated in colon cancer. As was the case for MUC17, only the C-terminal sequence of MUC12 is known, and it presents MUC12 as a membrane-bound protein with two EGF-like domains. Altogether, the ratio of the 7q22 mucins could be an indicator of the physiologic condition of a given tissue, normal versus malignant.
In conclusion, our study provides the completed coding sequence for the MUC17 mucin. MUC17 is expressed in two forms, membrane-associated and secreted. Unlike other MUC genes, there is a very low level of polymorphism in the VNTR region of MUC17. MUC17 is overexpressed in pancreatic cancer cells compared with the normal pancreas. In the future, the regulation of MUC17 expression will be explored to investigate its significance in cancer and under normal conditions.
* This work was supported by National Institutes of Health (NIH) Grants CA111294, CA72712, CA78590, and P20RR16469, by Nebraska Department of Health and Human Services Grant 2005-16, and by NCI, NIH Cancer Center Grant P30 CA36727 to the University of Nebraska Medical Center. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) AJ606308.
1 Both authors contributed equally to this work. 2 To whom correspondence should be addressed: Dept. of Biochemistry and Molecular Biology, Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, 985870 Nebraska Medical Center, Omaha, NE 68198-4525. Tel.: 402-559-5455; Fax: 402-559-6650; E-mail: sbatra{at}unmc.edu.
3 The abbreviations used are: MUC, mucin; EGF, epidermal growth factor; RACE, rapid amplification of cDNA ends; RT, reverse transcription; TR, tandem repeat; VNTR, variable number of tandem repeats.
We thank Erik Moore and Allison Ruhde for excellent technical support and Kristi Berger for editing the manuscript. We also thank the director of the Tissue Bank at the University of Nebraska Medical Center, Dr. Julia Bridge, for providing tissue samples. The Comparative Human Tissue Network, Western Division, Case Western Reserve University, Columbus, OH, is gratefully acknowledged for providing the additional tissues for this study.