You are working on Singularity: singularity_proj_encode_fcc
BASE DIRECTORY (FD_BASE): /data/reddylab/Kuei
REPO DIRECTORY (FD_REPO): /data/reddylab/Kuei/repo
WORK DIRECTORY (FD_WORK): /data/reddylab/Kuei/work
DATA DIRECTORY (FD_DATA): /data/reddylab/Kuei/data
You are working with ENCODE FCC
PATH OF PROJECT (FD_PRJ): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC
PROJECT RESULTS (FD_RES): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/results
PROJECT SCRIPTS (FD_EXE): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/scripts
PROJECT DATA (FD_DAT): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data
PROJECT NOTE (FD_NBK): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/notebooks
PROJECT DOCS (FD_DOC): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/docs
PROJECT LOG (FD_LOG): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/log
PROJECT REF (FD_REF): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/references
Set global variables
Code
TXT_FOLDER_REGION = "encode_chromatin_states"
Import data
Check data
Code
txt_fdiry = file.path(FD_DAT, "external", TXT_FOLDER_REGION)
vec = dir(txt_fdiry)
for (txt in vec){cat(txt, "\n")}
dat = dat_region_ccres
dat = dat %>%
    dplyr::mutate(Name = Label) %>%
    dplyr::arrange(Chrom, ChromStart, ChromEnd)

dat_region_ccres_label2name = dat
fun_display_table(head(dat))
| Chrom | ChromStart | ChromEnd | Name | Score | Strand | ThickStart | ThickEnd | ItemRgb | Label | Note |
|---|---|---|---|---|---|---|---|---|---|---|
| chr1 | 10033 | 10250 | Low-DNase | 0 | . | 10033 | 10250 | 225,225,225 | Low-DNase | All-data/Full-classification |
| chr1 | 10385 | 10713 | Low-DNase | 0 | . | 10385 | 10713 | 225,225,225 | Low-DNase | All-data/Full-classification |
| chr1 | 16097 | 16381 | Low-DNase | 0 | . | 16097 | 16381 | 225,225,225 | Low-DNase | All-data/Full-classification |
| chr1 | 17343 | 17642 | Low-DNase | 0 | . | 17343 | 17642 | 225,225,225 | Low-DNase | All-data/Full-classification |
| chr1 | 29320 | 29517 | Low-DNase | 0 | . | 29320 | 29517 | 225,225,225 | Low-DNase | All-data/Full-classification |
| chr1 | 66350 | 66509 | Low-DNase | 0 | . | 66350 | 66509 | 225,225,225 | Low-DNase | All-data/Full-classification |
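As a rough illustration of how the eleven column names shown above can be attached to a headerless, tab-separated BED-style file, the sketch below parses two toy lines with base R. The lines are built from rows displayed in this notebook, not read from the actual file, and `vec_cname`, `txt_lines`, and `dat_bed` are hypothetical names used only for this example. Base R is used so the snippet runs without the project's setup; the notebook itself may import the data differently.

```r
# Column names as displayed above: nine BED fields plus Label and Note
vec_cname = c(
    "Chrom", "ChromStart", "ChromEnd", "Name", "Score", "Strand",
    "ThickStart", "ThickEnd", "ItemRgb", "Label", "Note"
)

# Toy input: two tab-separated lines without a header (assumed layout)
txt_lines = c(
    "chr1\t138917\t139112\tpELS\t0\t.\t138917\t139112\t255,167,0\tpELS\tAll-data/Full-classification",
    "chr1\t778570\t778919\tPLS\t0\t.\t778570\t778919\t255,0,0\tPLS\tAll-data/Full-classification"
)

# Parse the lines and assign the column names in one step
dat_bed = read.table(
    text             = txt_lines,
    sep              = "\t",
    header           = FALSE,
    col.names        = vec_cname,
    stringsAsFactors = FALSE
)

# Inspect the resulting column types
str(dat_bed)
```

Because the separator is a tab, the comma-separated `ItemRgb` value stays in a single character column, while the coordinate columns are parsed as integers.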
Extract PLS/ELS only
Code
dat = dat_region_ccres_label2name
vec = c("PLS", "pELS", "dELS")
dat = dat %>%
    dplyr::filter(Label %in% vec) %>%
    dplyr::arrange(Chrom, ChromStart, ChromEnd)

dat_region_ccres_label2name_subset = dat
fun_display_table(head(dat))
| Chrom | ChromStart | ChromEnd | Name | Score | Strand | ThickStart | ThickEnd | ItemRgb | Label | Note |
|---|---|---|---|---|---|---|---|---|---|---|
| chr1 | 138917 | 139112 | pELS | 0 | . | 138917 | 139112 | 255,167,0 | pELS | All-data/Full-classification |
| chr1 | 778570 | 778919 | PLS | 0 | . | 778570 | 778919 | 255,0,0 | PLS | All-data/Full-classification |
| chr1 | 779023 | 779182 | PLS | 0 | . | 779023 | 779182 | 255,0,0 | PLS | All-data/Full-classification |
| chr1 | 825846 | 826068 | pELS | 0 | . | 825846 | 826068 | 255,167,0 | pELS | All-data/Full-classification |
| chr1 | 826734 | 826887 | PLS | 0 | . | 826734 | 826887 | 255,0,0 | PLS | All-data/Full-classification |
| chr1 | 827417 | 827767 | PLS | 0 | . | 827417 | 827767 | 255,0,0 | PLS | All-data/Full-classification |
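One way to sanity-check the label filter above is to tabulate `Label` before and after subsetting. The sketch below does this on a four-row toy table whose values are copied from rows displayed in this notebook. It uses base R subsetting in place of `dplyr::filter` so that it runs without the project's setup, and `dat_toy` and `dat_sub` are hypothetical names for illustration only.

```r
# Toy data: a few rows copied from the tables above (not the full data set)
dat_toy = data.frame(
    Chrom      = c("chr1", "chr1", "chr1", "chr1"),
    ChromStart = c(10033, 138917, 778570, 825846),
    ChromEnd   = c(10250, 139112, 778919, 826068),
    Label      = c("Low-DNase", "pELS", "PLS", "pELS"),
    stringsAsFactors = FALSE
)

# Labels to keep, as in the filtering step above
vec = c("PLS", "pELS", "dELS")

# Base R equivalent of dplyr::filter(Label %in% vec)
dat_sub = dat_toy[dat_toy$Label %in% vec, ]

# Tabulate labels before and after filtering
print(table(dat_toy$Label))
print(table(dat_sub$Label))

# Every remaining label should be one of the requested ones
stopifnot(all(dat_sub$Label %in% vec))
```

A label listed in `vec` but absent from the data (here `dELS`) simply contributes no rows; `%in%` does not raise an error for it.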
Define column description
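The regions shown above are in a BED9+2 layout: the nine standard BED fields (Chrom through ItemRgb) followed by two additional fields (Label and Note). This differs from the narrowPeak format, which is a standard six-field BED with four additional fields (BED6+4). The metadata below describes the six columns Chrom, ChromStart, ChromEnd, Name, Group, and Label.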
Code
### create metadata: column information
dat = tribble(
    ~Name,        ~Note,
    "Chrom",      "Name of the chromosome",
    "ChromStart", "The starting position of the feature in the chromosome",
    "ChromEnd",   "The ending position of the feature in the chromosome",
    "Name",       "Name given to a region; Use '.' if no name is assigned.",
    "Group",      "Type of chromatin states annotation",
    "Label",      "cCREs/ChromHMM labels"
)

### assign and show
dat_cname = dat
fun_display_table(dat)
| Name | Note |
|---|---|
| Chrom | Name of the chromosome |
| ChromStart | The starting position of the feature in the chromosome |
| ChromEnd | The ending position of the feature in the chromosome |
| Name | Name given to a region; Use '.' if no name is assigned. |