This analysis uses fragment count data generated and processed by the Reddy lab. The ASTARR experiment in K562 was designed and performed by Keith, and the WSTARR data in K562 were generated by Kari. For more information, see the data dictionary page.
Set environment
Code
source ../run_config_project.sh
show_env
You are working on Duke Server: HARDAC
BASE DIRECTORY (FD_BASE): /data/reddylab/Kuei
REPO DIRECTORY (FD_REPO): /data/reddylab/Kuei/repo
WORK DIRECTORY (FD_WORK): /data/reddylab/Kuei/work
DATA DIRECTORY (FD_DATA): /data/reddylab/Kuei/data
CONTAINER DIR. (FD_SING): /data/reddylab/Kuei/container
You are working with ENCODE FCC
PATH OF PROJECT (FD_PRJ): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC
PROJECT RESULTS (FD_RES): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/results
PROJECT SCRIPTS (FD_EXE): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/scripts
PROJECT DATA (FD_DAT): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data
PROJECT NOTE (FD_NBK): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/notebooks
PROJECT DOCS (FD_DOC): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/docs
PROJECT LOG (FD_LOG): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/log
PROJECT APP (FD_APP): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/app
PROJECT REF (FD_REF): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/references
PROJECT IMAGE (FP_PRJ_SIF): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/app/singularity_proj_encode_fcc.sif
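Because every later cell relies on the variables printed above, it can help to fail early when one of them is missing. The following is a minimal bash sketch, not part of the project scripts: `require_vars` is a hypothetical helper name, and it assumes the variables are plain (non-array) shell variables set by the sourced configuration.

```shell
# Hypothetical helper (not from run_config_project.sh):
# report every named variable that is unset or empty.
require_vars() {
    local name missing=0
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name
        if [ -z "${!name:-}" ]; then
            echo "missing variable: ${name}" >&2
            missing=1
        fi
    done
    # 0 when all variables are set, 1 otherwise
    return "${missing}"
}
```

A call such as `require_vars FD_DAT FD_RES FD_EXE || exit 1` right after sourcing the configuration would stop the notebook before any path is built from an empty variable.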
Check existence of STARR data
WSTARR
Code
fragments metadata motifs peaks processed_raw_reads qc raw_reads
Code
echo ${FD_WGS_WSTARR_FRAGS}
echo
for FPATH in ${FP_WGS_WSTARR_FRAGS[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-input-K562-rep1.masked.dedup.fragments.counts.txt.gz
A001-input-K562-rep2.masked.dedup.fragments.counts.txt.gz
A001-input-K562-rep3.masked.dedup.fragments.counts.txt.gz
A001-input-K562-rep4.masked.dedup.fragments.counts.txt.gz
A001-K562-rep1.masked.dedup.fragments.counts.txt.gz
A001-K562-rep2.masked.dedup.fragments.counts.txt.gz
A001-K562-rep3.masked.dedup.fragments.counts.txt.gz
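The listing above confirms existence by eye. A small helper that reports missing files explicitly could look like the sketch below. `check_paths` is a hypothetical name, and the sketch assumes each array element is a path to a regular file, as the output above suggests.

```shell
# Hypothetical helper: print each path that is not a regular file
# and return non-zero if at least one is missing.
check_paths() {
    local path status=0
    for path in "$@"; do
        if [ ! -f "${path}" ]; then
            echo "not found: ${path}" >&2
            status=1
        fi
    done
    return "${status}"
}
```

It would be called with the quoted array expansion, for example `check_paths "${FP_WGS_WSTARR_FRAGS[@]}"`, so that paths containing spaces are not split.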
Code
echo ${FD_WGS_WSTARR_INP_BAM}
echo
for FPATH in ${FP_WGS_WSTARR_INP_BWIGS[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
/data/reddylab/kstrouse/superstarr/input_libs/A001/nextseq/processing/starr_seq/A001_nextseq-pe
rep1.f3q10.sorted.dedup.rpkm.bw
rep2.f3q10.sorted.dedup.rpkm.bw
rep3.f3q10.sorted.dedup.rpkm.bw
rep4.f3q10.sorted.dedup.rpkm.bw
Code
echo ${FD_WGS_WSTARR_OUT_BAM_rep01}
echo ${FD_WGS_WSTARR_OUT_BAM_rep23}
echo
for FPATH in ${FP_WGS_WSTARR_OUT_BWIGS[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
/data/reddylab/kstrouse/superstarr/output_libs/A001_K562/A001_K562_20201124/combined_reads/processing/starr_seq/A001_K562_20201124_combined-pe
/data/reddylab/kstrouse/superstarr/output_libs/A001_K562/A001_K562_20210213/processing/starr_seq/Strouse_6825_210223A5-pe
A001-K562-rep1.f3q10.sorted.dedup.rpkm.bw
A001-K562-rep2.f3q10.sorted.dedup.rpkm.bw
A001-K562-rep3.f3q10.sorted.dedup.rpkm.bw
ASTARR (KS91)
Code
echo ${FD_WGS_ASTARR_KS91_INP}
echo ${FD_WGS_ASTARR_KS91_OUT}
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
Code
for FPATH in ${FP_WGS_ASTARR_KS91_FRAGS[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.txt.gz
Code
for FPATH in ${FP_WGS_ASTARR_KS91_FRAGS[@]}; do
    ls -l ${FPATH}
done
-rw-r--r-- 1 aeb84 reddylab 3146501530 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2/KS91_K562_hg38_ASTARRseq_Input_rep1.masked.dedup.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 3995769876 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2/KS91_K562_hg38_ASTARRseq_Input_rep2.masked.dedup.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 4291888489 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2/KS91_K562_hg38_ASTARRseq_Input_rep3.masked.dedup.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 4037119129 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2/KS91_K562_hg38_ASTARRseq_Input_rep4.masked.dedup.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 3938483893 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2/KS91_K562_hg38_ASTARRseq_Input_rep5.masked.dedup.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 3550102461 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2/KS91_K562_hg38_ASTARRseq_Input_rep6.masked.dedup.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 320000484 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis/KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 588618835 Apr 4 13:26 /data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis/KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.corrected.txt.gz
-rw-r--r-- 1 aeb84 reddylab 595652591 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis/KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 672333423 Apr 4 13:29 /data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis/KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.corrected.txt.gz
-rw-r--r-- 1 aeb84 reddylab 672333423 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis/KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.txt.gz
-rw-r--r-- 1 aeb84 reddylab 1096240707 Apr 4 14:08 /data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis/KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.corrected.txt.gz
-rw-r--r-- 1 aeb84 reddylab 1107286720 Jul 7 2022 /data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis/KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.txt.gz
Code
for FPATH in ${FP_WGS_ASTARR_KS91_BWIGS[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep5.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep6.f3q10.sorted.with_umis.dedup.cpm.bw
Code
for FPATH in ${FP_WGS_ASTARR_KS91[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep5.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep6.f3q10.sorted.with_umis.dedup.cpm.bw
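The combined list above holds several file types per replicate, so a quick tally by replicate label can make gaps easier to spot. The sketch below is illustrative only: `rep_of` and `count_by_rep` are hypothetical helpers, and they assume the replicate appears in the file name as `rep` followed by digits, as in the names listed here.

```shell
# Hypothetical helper: print the first replicate label (e.g. "rep2")
# found in a file name; return non-zero if there is none.
rep_of() {
    local rep
    rep=$(basename "$1" | grep -oE 'rep[0-9]+' | head -n 1)
    [ -n "${rep}" ] && echo "${rep}"
}

# Count how many of the given files carry each replicate label.
# Input and Output files with the same label are pooled together.
count_by_rep() {
    local path
    for path in "$@"; do
        rep_of "${path}" || true   # skip names without a replicate label
    done | sort | uniq -c
}
```

For instance, `count_by_rep "${FP_WGS_ASTARR_KS91[@]}"` would print one line per replicate label with the number of files that carry it.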
ASTARR (KS274)
Code
echo ${FD_WGS_ASTARR_KS274_OUT}
/data/reddylab/Keith/encode4_duke/processing/starr_seq/240311_KS274_ASTARR_Output_Nextseq-pe-umis
Code
for FPATH in ${FP_WGS_ASTARR_KS274_OUT_FRAGS[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
K562_ASTARR_repeat_rep1.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep2.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep3.f3q10.fragments.bedpe
Code
for FPATH in ${FP_WGS_ASTARR_KS274_OUT_BWIGS[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
K562_ASTARR_repeat_rep1.f3q10.sorted.with_umis.dedup.rpkm.bw
K562_ASTARR_repeat_rep2.f3q10.sorted.with_umis.dedup.rpkm.bw
K562_ASTARR_repeat_rep3.f3q10.sorted.with_umis.dedup.rpkm.bw
Code
for FPATH in ${FP_WGS_ASTARR_KS274[@]}; do
    ls ${FPATH} | xargs -n 1 basename
done
K562_ASTARR_repeat_rep1.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep2.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep3.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep1.f3q10.sorted.with_umis.dedup.rpkm.bw
K562_ASTARR_repeat_rep2.f3q10.sorted.with_umis.dedup.rpkm.bw
K562_ASTARR_repeat_rep3.f3q10.sorted.with_umis.dedup.rpkm.bw
Set data directories and copy the final processed data of STARR-seq
Create data directories
PROJECT/data/processed
├── STARR_ATAC_K562_Reddy_KS91_210401
│   ├── fragments
│   └── peaks
│
├── STARR_ATAC_K562_Reddy_KS274_240311
│   └── fragments
│
├── STARR_WHG_K562_Reddy_A001_Alex
│   └── fragments
│
└── STARR_WHG_K562_Reddy_A001_Kari
    └── superstarr
        ├── input_libs
        │   └── A001_K562
        └── output_libs
            ├── A001_K562_20201124
            └── A001_K562_20210213
Code
mkdir -p ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/fragments
mkdir -p ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/peaks
mkdir -p ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS274_240311/fragments
mkdir -p ${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Alex/fragments
FDIRY=${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Kari/superstarr
mkdir -p ${FDIRY}/input_libs
mkdir -p ${FDIRY}/output_libs
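The same layout can also be created from a list of relative paths, which keeps the directory names in one place. This is only a sketch of that idea: `make_layout` is a hypothetical helper, not something defined in the project.

```shell
# Sketch: create a set of sub-directories under one base directory.
# Usage: make_layout BASE SUBDIR...
make_layout() {
    local base=$1 sub
    shift
    for sub in "$@"; do
        # -p creates intermediate directories and is a no-op if they exist
        mkdir -p "${base}/${sub}" || return 1
    done
}
```

With the names used here, a call might read `make_layout "${FD_DAT}/processed" STARR_ATAC_K562_Reddy_KS91_210401/fragments STARR_ATAC_K562_Reddy_KS91_210401/peaks`, followed by the remaining sub-directories.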
Code
ls -1 ${FD_DAT}/processed/STARR*
ls -1 ${FD_DAT}/processed/STARR*/superstarr
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS274_240311:
fragments
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS91_210401:
fragments
peaks
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Alex:
fragments
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Kari:
superstarr
input_libs
output_libs
Copy data files
Copy WSTARR fragment counts
Code
FD_OUT=${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Alex/fragments
for FPATH in ${FP_WGS_WSTARR_FRAGS[@]}; do
    FDIRY=$(dirname ${FPATH})
    FNAME=$(basename ${FPATH})
    cp ${FPATH} ${FD_OUT}/${FNAME}
    echo ${FDIRY}
    echo ${FNAME}
    echo
done
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-input-K562-rep1.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-input-K562-rep2.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-input-K562-rep3.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-input-K562-rep4.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-K562-rep1.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-K562-rep2.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/data/starr_seq/fragments
A001-K562-rep3.masked.dedup.fragments.counts.txt.gz
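The copy loop above is repeated for each dataset with only the destination and the file list changing, so it could be wrapped in a function. The sketch below is one possible shape, assuming bash. `copy_files` is a hypothetical name, and unlike the original loop it stops at the first failed copy.

```shell
# Hypothetical helper: copy files into a destination directory and
# print the source directory and file name of each, like the loop above.
# Usage: copy_files DEST FILE...
copy_files() {
    local dest=$1 fpath fname
    shift
    mkdir -p "${dest}" || return 1
    for fpath in "$@"; do
        fname=$(basename "${fpath}")
        cp "${fpath}" "${dest}/${fname}" || return 1
        dirname "${fpath}"
        echo "${fname}"
        echo
    done
}
```

The cell above would then reduce to `copy_files "${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Alex/fragments" "${FP_WGS_WSTARR_FRAGS[@]}"`.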
Copy WSTARR bigwigs
Code
FD_OUT=${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Kari/superstarr/input_libs
for FPATH in ${FP_WGS_WSTARR_INP_BWIGS[@]}; do
    FDIRY=$(dirname ${FPATH})
    FNAME=$(basename ${FPATH})
    cp ${FPATH} ${FD_OUT}/${FNAME}
    echo ${FDIRY}
    echo ${FNAME}
    echo
done
/data/reddylab/kstrouse/superstarr/input_libs/A001/nextseq/processing/starr_seq/A001_nextseq-pe
rep1.f3q10.sorted.dedup.rpkm.bw
/data/reddylab/kstrouse/superstarr/input_libs/A001/nextseq/processing/starr_seq/A001_nextseq-pe
rep2.f3q10.sorted.dedup.rpkm.bw
/data/reddylab/kstrouse/superstarr/input_libs/A001/nextseq/processing/starr_seq/A001_nextseq-pe
rep3.f3q10.sorted.dedup.rpkm.bw
/data/reddylab/kstrouse/superstarr/input_libs/A001/nextseq/processing/starr_seq/A001_nextseq-pe
rep4.f3q10.sorted.dedup.rpkm.bw
Code
FD_OUT=${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Kari/superstarr/output_libs
for FPATH in ${FP_WGS_WSTARR_OUT_BWIGS[@]}; do
    FDIRY=$(dirname ${FPATH})
    FNAME=$(basename ${FPATH})
    cp ${FPATH} ${FD_OUT}/${FNAME}
    echo ${FDIRY}
    echo ${FNAME}
    echo
done
/data/reddylab/kstrouse/superstarr/output_libs/A001_K562/A001_K562_20201124/combined_reads/processing/starr_seq/A001_K562_20201124_combined-pe
A001-K562-rep1.f3q10.sorted.dedup.rpkm.bw
/data/reddylab/kstrouse/superstarr/output_libs/A001_K562/A001_K562_20210213/processing/starr_seq/Strouse_6825_210223A5-pe
A001-K562-rep2.f3q10.sorted.dedup.rpkm.bw
/data/reddylab/kstrouse/superstarr/output_libs/A001_K562/A001_K562_20210213/processing/starr_seq/Strouse_6825_210223A5-pe
A001-K562-rep3.f3q10.sorted.dedup.rpkm.bw
Copy ASTARR fragment counts and bigwigs
Code
FD_OUT=${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/fragments
for FPATH in ${FP_WGS_ASTARR_KS91[@]}; do
    FDIRY=$(dirname ${FPATH})
    FNAME=$(basename ${FPATH})
    cp ${FPATH} ${FD_OUT}/${FNAME}
    echo ${FDIRY}
    echo ${FNAME}
    echo
done
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.dedup.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.corrected.txt.gz
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.corrected.txt.gz
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.corrected.txt.gz
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.txt.gz
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.exclude_dups.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.exclude_dups.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.exclude_dups.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.exclude_dups.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.exclude_dups.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal/merged2
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.exclude_dups.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.sorted.with_umis.dedup.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.sorted.with_umis.dedup.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep5.f3q10.sorted.with_umis.dedup.cpm.bw
/data/reddylab/Alex/encode4_duke/processing/starr_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-umis
KS91_K562_hg38_ASTARRseq_Output_rep6.f3q10.sorted.with_umis.dedup.cpm.bw
Code
FD_OUT=${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS274_240311/fragments
for FPATH in ${FP_WGS_ASTARR_KS274[@]}; do
    FDIRY=$(dirname ${FPATH})
    FNAME=$(basename ${FPATH})
    cp ${FPATH} ${FD_OUT}/${FNAME}
    echo ${FDIRY}
    echo ${FNAME}
    echo
done
/data/reddylab/Keith/encode4_duke/processing/starr_seq/240311_KS274_ASTARR_Output_Nextseq-pe-umis
K562_ASTARR_repeat_rep1.f3q10.fragments.bedpe
/data/reddylab/Keith/encode4_duke/processing/starr_seq/240311_KS274_ASTARR_Output_Nextseq-pe-umis
K562_ASTARR_repeat_rep2.f3q10.fragments.bedpe
/data/reddylab/Keith/encode4_duke/processing/starr_seq/240311_KS274_ASTARR_Output_Nextseq-pe-umis
K562_ASTARR_repeat_rep3.f3q10.fragments.bedpe
/data/reddylab/Keith/encode4_duke/processing/starr_seq/240311_KS274_ASTARR_Output_Nextseq-pe-umis
K562_ASTARR_repeat_rep1.f3q10.sorted.with_umis.dedup.rpkm.bw
/data/reddylab/Keith/encode4_duke/processing/starr_seq/240311_KS274_ASTARR_Output_Nextseq-pe-umis
K562_ASTARR_repeat_rep2.f3q10.sorted.with_umis.dedup.rpkm.bw
/data/reddylab/Keith/encode4_duke/processing/starr_seq/240311_KS274_ASTARR_Output_Nextseq-pe-umis
K562_ASTARR_repeat_rep3.f3q10.sorted.with_umis.dedup.rpkm.bw
Copy ASTARR peak calls
Code
FD_OUT=${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/peaks
for FPATH in ${FP_WGS_ASTARR_KS91_INP_PEAKS[@]}; do
    FDIRY=$(dirname ${FPATH})
    FNAME=$(basename ${FPATH})
    cp ${FPATH} ${FD_OUT}/${FNAME}
    echo ${FDIRY}
    echo ${FNAME}
    echo
done
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal
KS91_K562_hg38_ASTARRseq_Input.all_reps.masked.union_narrowPeak.q5.bed
/data/reddylab/Alex/encode4_duke/processing/atac_seq/210401_KS91_K562ASTARR_NovaSeq.hg38-pe-blacklist-removal
KS91_K562_hg38_ASTARRseq_Input.q5.in_all.max_overlaps.bed
Check results
Check that the data were copied correctly.
Code
ls -d ${FD_DAT}/processed/STARR*
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS274_240311
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS91_210401
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Alex
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Kari
Code
ls -1 ${FD_DAT}/processed/STARR*
ls -1 ${FD_DAT}/processed/STARR*/superstarr
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS274_240311:
fragments
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS91_210401:
fragments
peaks
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Alex:
fragments
/data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Kari:
superstarr
input_libs
output_libs
Code
ls -1 ${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Alex/fragments
A001-input-K562-rep1.masked.dedup.fragments.counts.txt.gz
A001-input-K562-rep2.masked.dedup.fragments.counts.txt.gz
A001-input-K562-rep3.masked.dedup.fragments.counts.txt.gz
A001-input-K562-rep4.masked.dedup.fragments.counts.txt.gz
A001-K562-rep1.masked.dedup.fragments.counts.txt.gz
A001-K562-rep2.masked.dedup.fragments.counts.txt.gz
A001-K562-rep3.masked.dedup.fragments.counts.txt.gz
Code
ls -1 ${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Kari/superstarr/input_libs
rep1.f3q10.sorted.dedup.rpkm.bw
rep2.f3q10.sorted.dedup.rpkm.bw
rep3.f3q10.sorted.dedup.rpkm.bw
rep4.f3q10.sorted.dedup.rpkm.bw
Code
ls -1 ${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Kari/superstarr/output_libs
A001-K562-rep1.f3q10.sorted.dedup.rpkm.bw
A001-K562-rep2.f3q10.sorted.dedup.rpkm.bw
A001-K562-rep3.f3q10.sorted.dedup.rpkm.bw
Code
ls -1 ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/fragments
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep1.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep2.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep3.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep4.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep5.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.dedup.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Input_rep6.masked.exclude_dups.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep1.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep2.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep3.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.corrected.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.fragments.counts.txt.gz
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.corrected.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep4.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep5.f3q10.sorted.with_umis.dedup.cpm.bw
KS91_K562_hg38_ASTARRseq_Output_rep6.f3q10.sorted.with_umis.dedup.cpm.bw
Code
ls -1 ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/peaks
KS91_K562_hg38_ASTARRseq_Input.all_reps.masked.union_narrowPeak.q5.bed
KS91_K562_hg38_ASTARRseq_Input.q5.in_all.max_overlaps.bed
Code
ls -1 ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS274_240311/fragments
K562_ASTARR_repeat_rep1.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep1.f3q10.sorted.with_umis.dedup.rpkm.bw
K562_ASTARR_repeat_rep2.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep2.f3q10.sorted.with_umis.dedup.rpkm.bw
K562_ASTARR_repeat_rep3.f3q10.fragments.bedpe
K562_ASTARR_repeat_rep3.f3q10.sorted.with_umis.dedup.rpkm.bw
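The listings above show that the expected file names are present. They do not show that the contents match the originals. One way to check that, sketched here under the assumption that the source files are still readable, is a byte-wise comparison with `cmp`. `verify_copies` is a hypothetical helper name.

```shell
# Hypothetical helper: confirm each source file has a byte-identical copy in DEST.
# Usage: verify_copies DEST FILE...
verify_copies() {
    local dest=$1 fpath target status=0
    shift
    for fpath in "$@"; do
        target="${dest}/$(basename "${fpath}")"
        # cmp -s is silent; it exits non-zero if the files differ or one is missing
        if ! cmp -s "${fpath}" "${target}"; then
            echo "differs or missing: ${target}" >&2
            status=1
        fi
    done
    return "${status}"
}
```

An example call is `verify_copies "${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Alex/fragments" "${FP_WGS_WSTARR_FRAGS[@]}"`. Since `cmp` reads both files in full, this can take a while on multi-gigabyte fragment files.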
Check folder size
Code
du -sh ${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Alex
9.8G /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Alex
Code
du -sh ${FD_DAT}/processed/STARR_WHG_K562_Reddy_A001_Kari
8.4G /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_WHG_K562_Reddy_A001_Kari
Code
du -sh ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/fragments
44G /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS91_210401/fragments
Code
du -sh ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS91_210401/peaks
4.1M /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS91_210401/peaks
Code
du -sh ${FD_DAT}/processed/STARR_ATAC_K562_Reddy_KS274_240311
5.5G /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data/processed/STARR_ATAC_K562_Reddy_KS274_240311
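The separate `du` calls above could also be gathered into one loop over every dataset folder. The following is a minimal sketch with a hypothetical helper, `folder_sizes`, which assumes all dataset folders share a common name prefix such as `STARR`.

```shell
# Sketch: report the size of every sub-directory of BASE whose name starts with PREFIX.
# Usage: folder_sizes BASE PREFIX
folder_sizes() {
    local base=$1 prefix=$2 dir
    for dir in "${base}/${prefix}"*/; do
        # if nothing matches, the pattern stays literal and is skipped here
        [ -d "${dir}" ] || continue
        du -sh "${dir%/}"
    done
}
```

Calling `folder_sizes "${FD_DAT}/processed" STARR` would print one size line per dataset folder.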