Welcome to CRISPRBank. The current version of this database contains an analysis of all genomes from RefSeq release 95 (July 2019). All 151,845 bacterial and 855 archaeal genomes were analysed using CRISPRDetect 2.4 (also available through GitHub). These include genomes marked as complete as well as those at lower levels of assembly (e.g. scaffold or contig). CRISPRDetect and CRISPRBank are part of CRISPRSuite, described here:
To cite this data use: Biswas A, Staals RH, Morales SE, Fineran PC, Brown CM: CRISPRDetect: A flexible algorithm to define CRISPR arrays. BMC Genomics 2016, 17(1):356.
Other recent analyses of the CRISPR array and Cas gene content of complete genome sequences can be found at the following links:
CRISPRCasdb (16,990 complete GenBank genomes, 12/6/2019)
Makarova et al. 2019, CRISPRclass19 (13,116 complete genomes, 1/3/2019). We intend to include a comparison of these data with our CRISPRBank data in a later release of this interface (3/2020).
|Archaeal or bacterial genomes containing arrays with CRISPRDetect-defined repeats >=3, score >=4.0 and direct repeat (DR) length >=23.
|Kingdom||Genomes containing CRISPR arrays, of all RefSeq genomes (%)||Number of CRISPR arrays||Number of CRISPR spacers|
|Archaea||699 of 855 (81.8%)||2086||71252|
|Bacteria||70531 of 151845 (46.4%)||130293||1923258|
Notes: 1. This table considers only arrays; some array-containing genomes (5-10%) will lack functional cas genes. 2. Each RefSeq assembly (GCF..) is counted; there are many assemblies for some species, e.g. E. coli.
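The thresholds and percentages in the table above can be illustrated with a short Python sketch. It is not the CRISPRDetect or CRISPRBank code: the `ArrayRecord` fields, the function names and the example accessions are hypothetical, and it assumes that "repeats >=3" refers to the number of repeats in an array. It applies the three cut-offs to a list of array records, counts each assembly once, and reports the share of genomes with at least one passing array.

```python
from dataclasses import dataclass


@dataclass
class ArrayRecord:
    """One predicted CRISPR array (hypothetical record layout)."""
    assembly: str    # RefSeq assembly accession, e.g. "GCF_..." (placeholder values below)
    n_repeats: int   # number of repeats in the array (assumed meaning of "repeats")
    score: float     # array score reported by the detection tool
    dr_length: int   # length of the direct repeat (DR)


def passes_thresholds(rec, min_repeats=3, min_score=4.0, min_dr_length=23):
    """Apply the three cut-offs stated in the table caption."""
    return (rec.n_repeats >= min_repeats
            and rec.score >= min_score
            and rec.dr_length >= min_dr_length)


def percent(part, whole):
    """Percentage rounded to one decimal place, as shown in the table."""
    return round(100 * part / whole, 1)


def summarize(records, total_genomes):
    """Return (genomes with >=1 passing array, passing arrays, percent of genomes)."""
    kept = [r for r in records if passes_thresholds(r)]
    genomes = {r.assembly for r in kept}  # each assembly is counted once
    return len(genomes), len(kept), percent(len(genomes), total_genomes)


if __name__ == "__main__":
    # Made-up records, for illustration only.
    records = [
        ArrayRecord("GCF_A", 5, 4.5, 29),
        ArrayRecord("GCF_A", 3, 4.0, 23),   # exactly at every threshold: kept
        ArrayRecord("GCF_B", 2, 6.0, 30),   # too few repeats: dropped
        ArrayRecord("GCF_C", 10, 3.9, 36),  # score too low: dropped
    ]
    print(summarize(records, total_genomes=4))
    # Percentages recomputed from the counts given in the table.
    print(percent(699, 855), percent(70531, 151845))
```

Because every assembly is counted separately, species with many assemblies (such as E. coli) contribute many times to both the numerator and the denominator of these percentages.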