<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>cran-files | B101nfo</title>
    <link>https://llrs.dev/tags/cran-files/</link>
      <atom:link href="https://llrs.dev/tags/cran-files/index.xml" rel="self" type="application/rss+xml" />
    <description>cran-files</description>
    <generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>If it is code you can copy and reuse (MIT) if it is text, please cite and reuse CC-BY 2024.</copyright><lastBuildDate>Thu, 28 Jul 2022 00:00:00 +0000</lastBuildDate>
    <image>
      <url>img/map[gravatar:%!s(bool=false) shape:circle]</url>
      <title>cran-files</title>
      <link>https://llrs.dev/tags/cran-files/</link>
    </image>
    
    <item>
      <title>Exploring CRAN&#39;s files: part 2</title>
      <link>https://llrs.dev/post/2022/07/28/cran-files-2/</link>
      <pubDate>Thu, 28 Jul 2022 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2022/07/28/cran-files-2/</guid>
      <description>


&lt;div id=&#34;introduction&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/&#34;&gt;first post&lt;/a&gt; of the series we briefly explored packages available on CRAN.
Now I’ll focus on the history of the packages and their size using the following files:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;packages &amp;lt;- tools::CRAN_package_db()
current &amp;lt;- tools:::CRAN_current_db()
archive &amp;lt;- tools:::CRAN_archive_db()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this part we will use two of these files, &lt;code&gt;current&lt;/code&gt; and &lt;code&gt;archive&lt;/code&gt;; let’s see why.&lt;/p&gt;
&lt;div id=&#34;current-file&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;current file&lt;/h3&gt;
&lt;p&gt;The current database has the package size, the dates of modification (which I assume correspond to the date the package was added to CRAN) and the user name of whoever last modified it.
This is the same information returned by &lt;a href=&#34;https://search.r-project.org/R/refmans/base/html/file.info.html&#34;&gt;&lt;code&gt;file.info&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;current[1, 1:10]
##     size isdir mode               mtime               ctime               atime
## A3 42810 FALSE  664 2015-08-16 23:05:54 2022-09-03 12:02:27 2022-09-03 14:00:19
##     uid  gid  uname    grname
## A3 1001 1001 hornik cranadmin&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;archive-file&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;archive file&lt;/h3&gt;
&lt;p&gt;The archive database returns the same information but, as you might guess from the name, it covers the package versions in the archive, which are no longer available by default, rather than the current ones.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;archive[[1]]
##                     size isdir mode               mtime               ctime
## A3/A3_0.9.1.tar.gz 45252 FALSE  664 2013-02-07 10:00:29 2022-08-22 18:14:53
## A3/A3_0.9.2.tar.gz 45907 FALSE  664 2013-03-26 19:58:40 2022-08-22 18:14:53
##                                  atime  uid  gid  uname    grname
## A3/A3_0.9.1.tar.gz 2022-08-22 17:39:50 1001 1001 hornik cranadmin
## A3/A3_0.9.2.tar.gz 2022-08-22 17:39:50 1010 1001 ligges cranadmin&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The date matches that available on the &lt;a href=&#34;https://cran.r-project.org/src/contrib/Archive/A3/&#34;&gt;web’s old sources&lt;/a&gt;, so we can be confident of its meaning.&lt;/p&gt;
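&lt;p&gt;As a rough sketch of how such a list could be turned into a single table, the example below flattens a toy stand-in for &lt;code&gt;archive&lt;/code&gt; and recovers the package name and version from the file names. The object &lt;code&gt;toy_archive&lt;/code&gt;, the package &lt;code&gt;fakepkg&lt;/code&gt; and its values are made up for illustration; only the A3 rows copy the output above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy stand-in for the archive list: one data frame per package,
# one row per archived tarball (fakepkg is hypothetical).
toy_archive &amp;lt;- list(
  A3 = data.frame(
    size = c(45252, 45907),
    uname = c(&#34;hornik&#34;, &#34;ligges&#34;),
    row.names = c(&#34;A3/A3_0.9.1.tar.gz&#34;, &#34;A3/A3_0.9.2.tar.gz&#34;)
  ),
  fakepkg = data.frame(
    size = 10000,
    uname = &#34;hornik&#34;,
    row.names = &#34;fakepkg/fakepkg_1.0.tar.gz&#34;
  )
)

# Stack all the per-package data frames; unname() keeps the row names as file paths.
flat &amp;lt;- do.call(rbind, unname(toy_archive))

# Package names cannot contain &#34;_&#34;, so the first &#34;_&#34; separates name and version.
tarball &amp;lt;- basename(rownames(flat))
flat$package &amp;lt;- sub(&#34;_.*$&#34;, &#34;&#34;, tarball)
flat$version &amp;lt;- sub(&#34;^[^_]*_(.*)\\.tar\\.gz$&#34;, &#34;\\1&#34;, tarball)

flat[, c(&#34;package&#34;, &#34;version&#34;, &#34;size&#34;)]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the real file the same steps should apply, assuming every element keeps the layout shown above.&lt;/p&gt;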
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;cran-history&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;CRAN history&lt;/h2&gt;
&lt;p&gt;As we have seen there are some files about the archives of CRAN.
These include the date of modification (moving/editing), the user who did it and, of course, the name and sometimes the version of the package.
These archives are the great treasure of CRAN because they help to reproduce experiments or analyses that were run a long time ago.&lt;/p&gt;
&lt;p&gt;Note that I’m not totally sure that this archive contains the full record of packages; some initial packages might be missing.
I’m also aware of some packages removed by CRAN which no longer appear in these records.&lt;/p&gt;
&lt;p&gt;Nevertheless, this should provide an accurate picture of packages available through time.
Also, as these files have no information about when a package was archived (&lt;a href=&#34;https://llrs.dev/post/2021/12/07/reasons-cran-archivals/&#34;&gt;PACKAGES.in does have it&lt;/a&gt;), I might overestimate the packages available at any given moment.&lt;/p&gt;
&lt;p&gt;Remember the plot about &lt;a href=&#34;#accepted&#34;&gt;acceptance of packages on CRAN?&lt;/a&gt;
That plot only looked at the packages currently available; let’s check it with the whole archive:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:accumulative-packages&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/accumulative-packages-1.png&#34; alt=&#34;*Packages on CRAN archive by their addition to it.* There are over 125000 archives on CRAN.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 1: &lt;em&gt;Packages on CRAN archive by their addition to it.&lt;/em&gt; There are over 125000 archives on CRAN.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;These archived files come both from packages with few releases and from packages with many releases.
If we look at which packages had the most releases:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Package&lt;/th&gt;&lt;th&gt;Releases&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;spatstat&lt;/td&gt;&lt;td&gt;206&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Matrix&lt;/td&gt;&lt;td&gt;204&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;mgcv&lt;/td&gt;&lt;td&gt;162&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;RcppArmadillo&lt;/td&gt;&lt;td&gt;150&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;rgdal&lt;/td&gt;&lt;td&gt;146&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;nlme&lt;/td&gt;&lt;td&gt;143&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;caret&lt;/td&gt;&lt;td&gt;139&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;spdep&lt;/td&gt;&lt;td&gt;139&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;lattice&lt;/td&gt;&lt;td&gt;137&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;plotrix&lt;/td&gt;&lt;td&gt;131&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;sp&lt;/td&gt;&lt;td&gt;128&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;XML&lt;/td&gt;&lt;td&gt;126&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Rcmdr&lt;/td&gt;&lt;td&gt;123&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;lme4&lt;/td&gt;&lt;td&gt;122&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;gstat&lt;/td&gt;&lt;td&gt;121&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;arm&lt;/td&gt;&lt;td&gt;119&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;foreign&lt;/td&gt;&lt;td&gt;117&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;party&lt;/td&gt;&lt;td&gt;117&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;maptools&lt;/td&gt;&lt;td&gt;113&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;raster&lt;/td&gt;&lt;td&gt;108&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Surprisingly there are packages with more than 200 versions on CRAN!&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:release-distribution&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/release-distribution-1.png&#34; alt=&#34;*Releases distribution*. Packages and number of releases&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 2: &lt;em&gt;Releases distribution&lt;/em&gt;. Packages and number of releases
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The most common number of releases is 1 and packages usually have around 3, but the mean is around 6.&lt;/p&gt;
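&lt;p&gt;Assuming the three figures above are the mode, the median and the mean of the number of releases per package, a minimal sketch of that summary could look like the following. The data are invented (seven hypothetical packages, all still on CRAN) and chosen so that the three statistics differ in the same way.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical archive: one data frame per package, one row per archived release.
archived_n &amp;lt;- c(pkgA = 2, pkgB = 3, pkgC = 5, pkgD = 25)
toy_archive &amp;lt;- lapply(archived_n, function(n) data.frame(size = seq_len(n)))

# Hypothetical current packages: pkgE, pkgF and pkgG have no archived versions.
toy_current &amp;lt;- paste0(&#34;pkg&#34;, LETTERS[1:7])

# Releases = 1 current version + the archived versions, if any.
archived &amp;lt;- vapply(toy_archive, nrow, integer(1))
releases &amp;lt;- setNames(rep(1L, length(toy_current)), toy_current)
releases[names(archived)] &amp;lt;- releases[names(archived)] + archived

# Mode (most frequent value), median and mean of the release counts.
counts &amp;lt;- table(releases)
most_common &amp;lt;- as.integer(names(counts)[which.max(counts)])
c(most_common = most_common, median = median(releases), mean = mean(releases))
## most_common      median        mean
##           1           3           6&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A long right tail, like the few packages with more than 100 releases, pulls the mean above the median.&lt;/p&gt;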
&lt;p&gt;Given all these different versions of packages, how big are all the packages on CRAN?&lt;/p&gt;
&lt;div id=&#34;cran-size&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;CRAN size&lt;/h3&gt;
&lt;p&gt;Have you ever wondered how big CRAN is? Adding up the file sizes of the source packages, all CRAN source packages take approximately 96.8 GB.&lt;/p&gt;
&lt;p&gt;This doesn’t include binaries for multiple architectures and OS.
The package size might indicate whether the package has a considerable amount of data.&lt;/p&gt;
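&lt;p&gt;A minimal sketch of how such a total could be computed, assuming the &lt;code&gt;size&lt;/code&gt; column is in bytes as in &lt;code&gt;file.info&lt;/code&gt;. The objects &lt;code&gt;toy_current&lt;/code&gt; and &lt;code&gt;toy_archive&lt;/code&gt; are small stand-ins for the real files, and &lt;code&gt;fakepkg&lt;/code&gt; with its size is made up.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy stand-ins: sizes in bytes, A3 values copied from the outputs above.
toy_current &amp;lt;- data.frame(size = c(42810, 150000),
                          row.names = c(&#34;A3&#34;, &#34;fakepkg&#34;))
toy_archive &amp;lt;- list(A3 = data.frame(size = c(45252, 45907)))

# Add the current tarballs and every archived tarball.
archived_bytes &amp;lt;- sum(vapply(toy_archive, function(x) sum(x$size), numeric(1)))
total_bytes &amp;lt;- sum(toy_current$size) + archived_bytes

# Decimal gigabytes; divide by 1024^3 instead for binary units (GiB).
total_gb &amp;lt;- total_bytes / 1e9
total_gb&lt;/code&gt;&lt;/pre&gt;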
&lt;p&gt;Looking at the size of the packages over time we can see this pattern:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:packages-size&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/packages-size-1.png&#34; alt=&#34;*Packages and their median size.* Archived packages have become bigger since 2014. Packages on CRAN have been getting bigger since 2017.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 3: &lt;em&gt;Packages and their median size.&lt;/em&gt; Archived packages have become bigger since 2014. Packages on CRAN have been getting bigger since 2017.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Packages available on CRAN are usually smaller than those no longer on CRAN.
However, the archived versions of packages still on CRAN are usually bigger than their current versions.
The median size of packages is increasing (quickly).&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:release-size&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/release-size-1.png&#34; alt=&#34;*Size of package with releases.* Packages are usually small but seem to gain weight when updating.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 4: &lt;em&gt;Size of package with releases.&lt;/em&gt; Packages are usually small but seem to gain weight when updating.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Typically packages increase their size with each new release until they reach 50 releases.
For higher release counts this plot depends on very few packages and might not be representative.&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:release-size2&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/release-size2-1.png&#34; alt=&#34;*Size of package with releases by availability.* Packages no longer on CRAN are usually smaller than those on it. The continuous black line is CRAN&#39;s current threshold, while the discontinuous black line is the current median size.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 5: &lt;em&gt;Size of package with releases by availability.&lt;/em&gt; Packages no longer on CRAN are usually smaller than those on it. The continuous black line is CRAN’s current threshold, while the discontinuous black line is the current median size.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Here we can appreciate better how packages tend to be below the CRAN threshold.
There isn’t much of a difference between packages available on CRAN and those archived.&lt;/p&gt;
&lt;p&gt;If we look at the size of the first release of each package over time we’ll see a representative view:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:size-time&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/size-time-1.png&#34; alt=&#34;*Size of the first release by time*. Package size increases with time, with a peak around 2010, and has been increasing again since 2014 but still hasn&#39;t surpassed the previous record.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 6: &lt;em&gt;Size of the first release by time&lt;/em&gt;. Package size increases with time, with a peak around 2010, and has been increasing again since 2014 but still hasn’t surpassed the previous record.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Package size tends to increase except for the brief period 2010-2014.
Currently it increases less than before that period but is close to its maximum.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusions&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Most packages are not updated much: between 1 and 3 times.
But there are packages that are updated quite a lot. This might mean they are data packages rather than software packages, or that they have frequent minor and major updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Most current packages are smaller than those archived.
Packages no longer available were usually bigger than the packages still on CRAN.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Surprisingly, packages increase their size a lot until the 25th release.
Size also increases with time, except for the period between 2010 and 2014.
This decrease might be due to a change in CRAN policy.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;future-parts&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Future parts&lt;/h2&gt;
&lt;p&gt;In future posts I’ll explore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;patterns in the acceptance of packages and in their updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the relation between dependencies, initial release and updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;who handled the packages.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Exploring CRAN&#39;s files: part 1</title>
      <link>https://llrs.dev/post/2022/07/23/cran-files-1/</link>
      <pubDate>Sat, 23 Jul 2022 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2022/07/23/cran-files-1/</guid>
      <description>


&lt;div id=&#34;introduction&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;There are many great things in base R, one of them is the &lt;a href=&#34;https://search.r-project.org/R/refmans/tools/html/00Index.html&#34;&gt;tools package&lt;/a&gt;.
This package has the functions that are used to build, check and create packages, documentation and manuals.&lt;/p&gt;
&lt;p&gt;As I wanted to know how CRAN works and how it changes, I was looking into the source code of tools.
I found some internal functions that access freely available files with information about CRAN packages.
These private functions are in the &lt;a href=&#34;https://svn.r-project.org/R/trunk/src/library/tools/R/CRANtools.R&#34;&gt;CRANtools.R file&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;packages &amp;lt;- tools::CRAN_package_db()
# current &amp;lt;- tools:::CRAN_current_db()
# archive &amp;lt;- tools:::CRAN_archive_db()
# issues &amp;lt;- tools::CRAN_check_issues()
# alias &amp;lt;- tools:::CRAN_aliases_db()
# rdxrefs &amp;lt;- tools:::CRAN_rdxrefs_db()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As I was not sure about the information in these files I asked on &lt;a href=&#34;https://stat.ethz.ch/pipermail/r-devel/2022-May/081770.html&#34;&gt;R-devel&lt;/a&gt;, but I did not receive an answer.
They seem to be quite obscure and, being private functions, they might be removed without notice and shouldn’t be used in any dependency.
However, as the files contain information about CRAN they might provide interesting clues about the history of CRAN and how it is operated.&lt;/p&gt;
&lt;p&gt;In this post I will focus on the first file.
I’ll explore a couple of fields and in future posts I will use the other files to explore more about CRAN history.&lt;/p&gt;
&lt;div id=&#34;packages-file&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;packages file&lt;/h3&gt;
&lt;p&gt;First of all a very brief exploration of what is in this file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;##    Package Version Priority                        Depends
## 1       A3   1.0.0     &amp;lt;NA&amp;gt; R (&amp;gt;= 2.15.0), xtable, pbapply
## 2 AATtools   0.0.1     &amp;lt;NA&amp;gt;                   R (&amp;gt;= 3.6.0)
## 3   ABACUS   1.0.0     &amp;lt;NA&amp;gt;                   R (&amp;gt;= 3.1.0)
##                                 Imports LinkingTo
## 1                                  &amp;lt;NA&amp;gt;      &amp;lt;NA&amp;gt;
## 2  magrittr, dplyr, doParallel, foreach      &amp;lt;NA&amp;gt;
## 3 ggplot2 (&amp;gt;= 3.1.0), shiny (&amp;gt;= 1.3.1),      &amp;lt;NA&amp;gt;
##                               Suggests Enhances    License License_is_FOSS
## 1                  randomForest, e1071     &amp;lt;NA&amp;gt; GPL (&amp;gt;= 2)            &amp;lt;NA&amp;gt;
## 2                                 &amp;lt;NA&amp;gt;     &amp;lt;NA&amp;gt;      GPL-3            &amp;lt;NA&amp;gt;
## 3 rmarkdown (&amp;gt;= 1.13), knitr (&amp;gt;= 1.22)     &amp;lt;NA&amp;gt;      GPL-3            &amp;lt;NA&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;packages&lt;/code&gt; has similar information to &lt;code&gt;available.packages()&lt;/code&gt; but with many more columns: published date, reverse dependencies, X-CRAN-Comment, who packaged it…
Also note that these packages are not filtered to match R version, OS_type or subarch, so there are near-duplicates (I learned about this filtering while reading the great documentation of &lt;a href=&#34;https://search.r-project.org/R/refmans/utils/html/available.packages.html&#34;&gt;&lt;code&gt;available.packages()&lt;/code&gt;&lt;/a&gt; and also finding some mentions online).&lt;/p&gt;
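&lt;p&gt;To illustrate what such near-duplicates could look like, here is a small sketch on invented data. The data frame &lt;code&gt;toy_packages&lt;/code&gt;, its packages and the choice of columns are assumptions for the example; the real file has many more columns and the repeated rows may differ in other fields.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Invented rows: pkgB appears twice, once per OS type.
toy_packages &amp;lt;- data.frame(
  Package = c(&#34;pkgA&#34;, &#34;pkgB&#34;, &#34;pkgB&#34;),
  Version = c(&#34;1.0.0&#34;, &#34;2.1.0&#34;, &#34;2.1.0&#34;),
  OS_type = c(NA, &#34;unix&#34;, &#34;windows&#34;)
)

# Names that occur more than once, and all the rows that carry them.
dup_names &amp;lt;- unique(toy_packages$Package[duplicated(toy_packages$Package)])
near_dups &amp;lt;- toy_packages[toy_packages$Package %in% dup_names, ]
near_dups&lt;/code&gt;&lt;/pre&gt;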
&lt;p&gt;As we have data from several years I’ll sometimes show the release dates of different R versions to provide some context.
Without further delay let’s explore the data!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;accepted&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Published packages&lt;/h2&gt;
&lt;p&gt;CRAN started some time ago (in 1997) but it hasn’t remained frozen.
The package archive (the A in CRAN) has been updating since then.
For instance the current packages do not include packages that were removed, archived or those replaced by updates.&lt;/p&gt;
&lt;p&gt;First packages are submitted to CRAN and once accepted they are published.
As acceptance and publication usually happen almost at the same time I might use the two terms as synonyms.
Looking at the current available packages and their publication date, we can see the following:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:daily-cran&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/daily-cran-1.png&#34; alt=&#34;ggplot2 plot of date vs packages accepted on a given day. Until 2020 fewer than 10 packages were accepted daily. Lately more than 30 are added to CRAN. The plot also displays the R release versions from 2.12 in 2010 to 4.2.0 in 2022.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 1: &lt;em&gt;Packages accepted on CRAN by the publication date.&lt;/em&gt;
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The oldest package version still current was published in 2010.
This means a package that has gone 12 years without issues, dependency changes, or bugs detected by the automatic checks!&lt;/p&gt;
&lt;p&gt;The daily rate of acceptance has increased from fewer than 10 a day until 2020 to more than 30 this year, 2022.
If we summarize that information by month we see the same trend; the little bump in 2020 disappears, but other patterns appear:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:monthly-cran&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/monthly-cran-1.png&#34; alt=&#34;ggplot figure with the monthly published packages. Until 2015 it rises very slowly, then it is around 50 monthly packages and there are some wobbles. In 2022 it rose to over 800 packages.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 2: &lt;em&gt;Monthly packages published to CRAN&lt;/em&gt;. Some monthly variance is observed.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Instead of just one bump we see some waves, with fewer packages accepted on CRAN late in the year and more in the first months of the year.&lt;/p&gt;
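&lt;p&gt;The monthly summary can be sketched in a few lines of base R. This is a minimal illustration and not necessarily the code behind the figure: the dates below are made up, and I assume the real ones would come from the &lt;code&gt;Published&lt;/code&gt; column of &lt;code&gt;tools::CRAN_package_db()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical publication dates (made up for illustration).
published &amp;lt;- as.Date(c(&amp;quot;2021-11-03&amp;quot;, &amp;quot;2021-11-20&amp;quot;, &amp;quot;2022-01-05&amp;quot;,
                       &amp;quot;2022-01-17&amp;quot;, &amp;quot;2022-01-30&amp;quot;, &amp;quot;2022-02-11&amp;quot;))

# Truncate each date to its year-month and count the packages per month.
month &amp;lt;- format(published, &amp;quot;%Y-%m&amp;quot;)
monthly &amp;lt;- table(month)
monthly&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that months without any publication (December 2021 in this toy example) simply do not appear in the table, so they would need to be filled in with zeros before plotting.&lt;/p&gt;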
&lt;p&gt;If we look at the accumulated packages on CRAN we see an exponential growth:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-cumsum&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-cumsum-1.png&#34; alt=&#34;Plot with the cumulative number of packages in CRAN. Rising from a few tens to currently more than 18000.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 3: &lt;em&gt;Accumulation of packages&lt;/em&gt;. Most of the packages have been published in the last 2 years.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In fact, more of the packages currently on CRAN were added since March 2021 than in all the previous years combined.&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-perc&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-perc-1.png&#34; alt=&#34;Line with percentages of packages in CRAN by date. Close to 50% of current packages were published between 2010 and 2021.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 4: &lt;em&gt;Percentage of current packages on CRAN according to their date of publication&lt;/em&gt;. Most of them were published/updated in the last year and a half.
&lt;/p&gt;
&lt;/div&gt;
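&lt;p&gt;The percentages in Figure 4 boil down to a cumulative count divided by the total. Here is a minimal sketch with made-up dates (the real ones would be the publication dates of the current packages); the variable names are only illustrative.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical publication dates of six current packages.
published &amp;lt;- as.Date(c(&amp;quot;2010-08-01&amp;quot;, &amp;quot;2015-03-10&amp;quot;, &amp;quot;2019-06-02&amp;quot;,
                       &amp;quot;2021-04-15&amp;quot;, &amp;quot;2021-09-30&amp;quot;, &amp;quot;2022-05-20&amp;quot;))
published &amp;lt;- sort(published)

# Cumulative number and share of packages published up to each date.
cumulative &amp;lt;- seq_along(published)
share &amp;lt;- cumulative / length(published)

# First date by which at least half of the packages were already published.
half_date &amp;lt;- published[which(share &amp;gt;= 0.5)[1]]
half_date&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The later this date falls, the larger the share of current packages that were published or updated recently.&lt;/p&gt;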
&lt;p&gt;This is a good time to remember that the date being used is the publication date of the current version of each package.
Many had previous versions on CRAN:&lt;/p&gt;
&lt;template id=&#34;9668142b-64d5-4c3d-842e-fbcef8304c16&#34;&gt;&lt;style&gt;
.tabwid table{
  border-spacing:0px !important;
  border-collapse:collapse;
  line-height:1;
  margin-left:auto;
  margin-right:auto;
  border-width: 0;
  display: table;
  margin-top: 1.275em;
  margin-bottom: 1.275em;
  border-color: transparent;
}
.tabwid_left table{
  margin-left:0;
}
.tabwid_right table{
  margin-right:0;
}
.tabwid td {
    padding: 0;
}
.tabwid a {
  text-decoration: none;
}
.tabwid thead {
    background-color: transparent;
}
.tabwid tfoot {
    background-color: transparent;
}
.tabwid table tr {
background-color: transparent;
}
&lt;/style&gt;&lt;div class=&#34;tabwid&#34;&gt;&lt;style&gt;.cl-3baefb4c{}.cl-3ba22c8c{font-family:&#39;DejaVu Sans&#39;;font-size:11pt;font-weight:normal;font-style:normal;text-decoration:none;color:rgba(0, 0, 0, 1.00);background-color:transparent;}.cl-3ba253e2{margin:0;text-align:left;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-3ba253ec{margin:0;text-align:right;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-3ba2b7e2{width:88.3pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b7f6{width:72.5pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b7f7{width:88.3pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b800{width:72.5pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid 
rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b80a{width:88.3pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b814{width:72.5pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}&lt;/style&gt;&lt;table class=&#39;cl-3baefb4c&#39;&gt;
&lt;thead&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-3ba2b80a&#34;&gt;&lt;p class=&#34;cl-3ba253e2&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;First release&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-3ba2b814&#34;&gt;&lt;p class=&#34;cl-3ba253ec&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;Packages&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-3ba2b7e2&#34;&gt;&lt;p class=&#34;cl-3ba253e2&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;No&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-3ba2b7f6&#34;&gt;&lt;p class=&#34;cl-3ba253ec&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;14,294&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-3ba2b7f7&#34;&gt;&lt;p class=&#34;cl-3ba253e2&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;Yes&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-3ba2b800&#34;&gt;&lt;p class=&#34;cl-3ba253ec&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;4,113&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;&lt;/template&gt;
&lt;div class=&#34;flextable-shadow-host&#34; id=&#34;1027b3f4-86a2-414b-90aa-a3bab733e0c0&#34;&gt;&lt;/div&gt;
&lt;script&gt;
var dest = document.getElementById(&#34;1027b3f4-86a2-414b-90aa-a3bab733e0c0&#34;);
var template = document.getElementById(&#34;9668142b-64d5-4c3d-842e-fbcef8304c16&#34;);
var caption = template.content.querySelector(&#34;caption&#34;);
if(caption) {
  caption.style.cssText = &#34;display:block;text-align:center;&#34;;
  var newcapt = document.createElement(&#34;p&#34;);
  newcapt.appendChild(caption)
  dest.parentNode.insertBefore(newcapt, dest.previousSibling);
}
var fantome = dest.attachShadow({mode: &#39;open&#39;});
var templateContent = template.content;
fantome.appendChild(templateContent);
&lt;/script&gt;
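&lt;p&gt;One way to obtain such a split is to check which current packages have an entry in the archive. This is a sketch under an assumption: that &lt;code&gt;tools:::CRAN_archive_db()&lt;/code&gt; returns a list named by package, with one element per package that has archived versions. The toy objects below only mimic that shape.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical stand-ins for the current packages and the archive.
current_pkgs &amp;lt;- c(&amp;quot;pkgA&amp;quot;, &amp;quot;pkgB&amp;quot;, &amp;quot;pkgC&amp;quot;)
archive &amp;lt;- list(pkgA = data.frame(size = 1000),
                pkgD = data.frame(size = 2000))

# A current package without archived versions is on its first release.
first_release &amp;lt;- !current_pkgs %in% names(archive)
table(ifelse(first_release, &amp;quot;Yes&amp;quot;, &amp;quot;No&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here &lt;code&gt;pkgD&lt;/code&gt; is only in the archive, so it does not count as a current package at all.&lt;/p&gt;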

&lt;/div&gt;
&lt;div id=&#34;delays&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Processing time&lt;/h2&gt;
&lt;p&gt;Previously I found that &lt;a href=&#34;https://llrs.dev/post/2021/01/31/cran-review/&#34;&gt;CRAN submissions&lt;/a&gt; present some key differences between new packages and already published packages, which affect how long they need to wait to be published on CRAN.
With the existing data we can see how fast the process is by comparing the published date with the build date.&lt;/p&gt;
&lt;p&gt;The build date is added to the tar.gz file automatically when the developer builds the package via &lt;code&gt;R CMD build&lt;/code&gt;. However, the published date is set by CRAN once the packages are accepted on CRAN.&lt;/p&gt;
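&lt;p&gt;As a rough sketch of this comparison, the delay is just the difference between the two dates. I assume here that the build date is stored in a &lt;code&gt;Packaged&lt;/code&gt; column as a timestamp followed by the user name (as in a DESCRIPTION file) and the publication date in a &lt;code&gt;Published&lt;/code&gt; column; the rows below are made up.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy stand-in for the packages file; all values are hypothetical.
pkgs &amp;lt;- data.frame(
  Package   = c(&amp;quot;pkgA&amp;quot;, &amp;quot;pkgB&amp;quot;),
  Packaged  = c(&amp;quot;2022-07-01 10:00:00 UTC; user1&amp;quot;,
                &amp;quot;2022-07-10 08:30:00 UTC; user2&amp;quot;),
  Published = c(&amp;quot;2022-07-03&amp;quot;, &amp;quot;2022-07-10&amp;quot;)
)

# Keep only the date part (first 10 characters) of the build timestamp.
built &amp;lt;- as.Date(substr(pkgs$Packaged, 1, 10))
published &amp;lt;- as.Date(pkgs$Published)

# Days between building the package and its publication on CRAN.
pkgs$delay_days &amp;lt;- as.numeric(difftime(published, built, units = &amp;quot;days&amp;quot;))
pkgs[, c(&amp;quot;Package&amp;quot;, &amp;quot;delay_days&amp;quot;)]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since the publication date has no time of day, the precision of this difference is one day at best.&lt;/p&gt;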
&lt;p&gt;To visualize the differences I will also check whether new packages differ from those that were already on CRAN:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-delays&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-delays-1.png&#34; alt=&#34;Histogram of packages and the time between build and publication. They take less than 50 days usually.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 5: &lt;em&gt;Histogram of time difference between building and publishing a package.&lt;/em&gt; Color indicates whether the package is new to CRAN or not. Most of the published packages take more or less the same time regardless of whether it is their first time or not.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The time between build and publication doesn’t seem to differ much between first releases and updates.
The precision is just a day and this is usually a fast process, well below 50 days.
Few packages spend longer than that between build and publication, and they are too few to be noticeable at this scale.
Since 2016/05/02 there is a &lt;a href=&#34;https://github.com/r-devel/r-svn/blob/676c1183801648b68f8f6719701445b2f9a5e3fd/src/library/tools/R/QC.R#L7583&#34;&gt;check&lt;/a&gt; that raises an issue if the build is older than a month.&lt;/p&gt;
&lt;p&gt;Note that one might need to build the package multiple times before it is accepted.
Packages published for the first time on CRAN might have been submitted previously, but once they finally build and pass the checks and the manual review they are handled as fast as packages already on CRAN.&lt;/p&gt;
&lt;p&gt;However, this time between build and acceptance might have changed with time:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-delays2&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-delays2-1.png&#34; alt=&#34;Smoothed lines of published packages with different linetype and color depending on if it is the first time they are on CRAN or not. New packages currently take less than 4 days and old packages less than 2. This is down from 2018 to 2021, when new packages took above 4 days to be published on CRAN&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 6: &lt;em&gt;Processing time between building the package and being published by date.&lt;/em&gt; There is a large difference between new packages and old ones. New packages usually take more time while existing packages take less than a day currently.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;We clearly see a difference in processing time between packages already on CRAN and those that are not.
Keep in mind that for the few packages from before 2016 the estimation might not be accurate.
At the same time this is consistent with the manual review process (for more information see &lt;a href=&#34;https://llrs.dev/post/2021/01/31/cran-review/&#34;&gt;my previous post&lt;/a&gt; about the review process of CRAN or my &lt;a href=&#34;https://llrs.dev/talk/user-2021/&#34;&gt;talk at useR! 2021&lt;/a&gt;).
It also means that there is a huge variation in how long packages take to be handled.
However, this seems to be shrinking: while in 2010 it took around 2 weeks, nowadays it takes less than a week and is getting closer to the median of 1 day between being built and appearing on CRAN that existing packages take.&lt;/p&gt;
&lt;p&gt;This difference might be explained by experience: authors and maintainers whose package(s) are already on CRAN know better how to submit a new version without problems in the checks.&lt;/p&gt;
&lt;p&gt;It could also be that new packages need more time from the CRAN team.
In 2020 we see it took longer than in previous years for packages to be added to CRAN.
Maybe the increase in the processing time in 2020 was due to the huge volume of submissions CRAN received or to more checks on the developer side before submitting to CRAN.&lt;/p&gt;
&lt;p&gt;The two explanations are not mutually exclusive.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;
Do more packages published the same day mean more processing time? It doesn’t look like it.
&lt;/summary&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-reasons&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-reasons-1.png&#34; alt=&#34;ggplot graphic with the processing time and the number of packages accepted the same day. New packages have less delay than already published packages, but the more packages are accepted, the less delay there is.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 7: &lt;em&gt;Packages accepted the same day and processing time.&lt;/em&gt; New packages are accepted sooner than packages already on CRAN, relative to the build date.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Surprisingly, we see a lot of variation in the delay of packages already accepted on CRAN.
In addition, the more new packages are accepted the same day, the less delay there is.
I think this just means that when reviewers work on the submission queue several packages might be approved together.&lt;/p&gt;
&lt;p&gt;This might also mean packages have already been built several times before finally being accepted, once the errors, warnings and notes have been solved.
Last, this could indicate that developers with a package already on CRAN wait a bit between building and submitting it, either to double check before submission (dependencies, on several machines, other?) or because of a time zone difference (submitting at noon in one region but during the reviewers’ night).&lt;/p&gt;
&lt;/details&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusion&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;There are packages that have been working without problems for 12 years despite the several major changes in R (See figure &lt;a href=&#34;#fig:daily-cran&#34;&gt;1&lt;/a&gt;).
This speaks volumes about the packages’ quality, and about the backward compatibility that R core aims for and CRAN checks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;CRAN accepts an incredible number of packages daily and monthly.
The system and the team are doing incredible work, mostly in their free time (See figure &lt;a href=&#34;#fig:monthly-cran&#34;&gt;2&lt;/a&gt;).
Many thanks!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Accepted packages are handled very fast, usually in less than a week (See figure &lt;a href=&#34;#fig:cran-reasons&#34;&gt;7&lt;/a&gt;).
But with these data alone it is not possible to separate the time spent in the submission system from the time spent on the developer’s computer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;future-parts&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Future parts&lt;/h2&gt;
&lt;p&gt;We’ve explored a snapshot of current packages and a brief window of all the history of CRAN.
There is much more that can be done with all the other files.&lt;/p&gt;
&lt;p&gt;In future posts I’ll explore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;patterns in accepting packages and in package updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;who handled the packages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;size of packages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the relation between dependencies, initial release and updates.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Other suggestions?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Edit&lt;/strong&gt;: Many thanks to &lt;a href=&#34;https://masalmon.eu/&#34;&gt;Maëlle Salmon&lt;/a&gt; and &lt;a href=&#34;https://dirk.eddelbuettel.com/&#34;&gt;Dirk Eddelbuettel&lt;/a&gt; for their feedback on an initial version of this series of posts.&lt;/p&gt;
&lt;div id=&#34;reproducibility&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Reproducibility&lt;/h3&gt;
&lt;details&gt;
&lt;pre&gt;&lt;code&gt;## - Session info -------------------------------------------------------------------------------------------------------
##  setting  value
##  version  R version 4.2.1 (2022-06-23)
##  os       Ubuntu 20.04.4 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  C
##  ctype    C
##  tz       Europe/Madrid
##  date     2022-07-23
##  pandoc   2.18 @ /usr/lib/rstudio/bin/quarto/bin/tools/ (via rmarkdown)
## 
## - Packages -----------------------------------------------------------------------------------------------------------
##  package      * version    date (UTC) lib source
##  assertthat     0.2.1      2019-03-21 [2] RSPM (R 4.2.0)
##  base64enc      0.1-3      2015-07-28 [2] CRAN (R 4.0.0)
##  blogdown       1.10       2022-05-10 [2] RSPM (R 4.2.0)
##  bookdown       0.27       2022-06-14 [2] RSPM (R 4.2.0)
##  bslib          0.4.0      2022-07-16 [2] RSPM (R 4.2.0)
##  cachem         1.0.6      2021-08-19 [2] RSPM (R 4.2.0)
##  cli            3.3.0      2022-04-25 [2] RSPM (R 4.2.0)
##  codetools      0.2-18     2020-11-04 [2] RSPM (R 4.2.0)
##  colorspace     2.0-3      2022-02-21 [2] RSPM (R 4.2.0)
##  crayon         1.5.1      2022-03-26 [2] RSPM (R 4.2.0)
##  curl           4.3.2      2021-06-23 [2] RSPM (R 4.2.0)
##  data.table     1.14.2     2021-09-27 [2] RSPM (R 4.2.0)
##  DBI            1.1.3      2022-06-18 [2] RSPM (R 4.2.0)
##  digest         0.6.29     2021-12-01 [2] RSPM (R 4.2.0)
##  dplyr        * 1.0.9      2022-04-28 [2] RSPM (R 4.2.0)
##  ellipsis       0.3.2      2021-04-29 [2] RSPM (R 4.2.0)
##  evaluate       0.15       2022-02-18 [2] RSPM (R 4.2.0)
##  fansi          1.0.3      2022-03-24 [2] RSPM (R 4.2.0)
##  farver         2.1.1      2022-07-06 [2] RSPM (R 4.2.0)
##  fastmap        1.1.0      2021-01-25 [2] RSPM (R 4.2.0)
##  flextable    * 0.7.2      2022-06-12 [2] RSPM (R 4.2.0)
##  forcats      * 0.5.1      2021-01-27 [2] RSPM (R 4.2.0)
##  gdtools        0.2.4      2022-02-14 [2] RSPM (R 4.2.0)
##  generics       0.1.3      2022-07-05 [2] RSPM (R 4.2.0)
##  geomtextpath * 0.1.0      2022-01-24 [2] CRAN (R 4.2.1)
##  ggplot2      * 3.3.6.9000 2022-06-29 [2] Github (tidyverse/ggplot2@7571122)
##  ggrepel      * 0.9.1      2021-01-15 [2] RSPM (R 4.2.0)
##  glue           1.6.2      2022-02-24 [2] RSPM (R 4.2.0)
##  gtable         0.3.0      2019-03-25 [2] CRAN (R 4.0.0)
##  highr          0.9        2021-04-16 [2] RSPM (R 4.2.0)
##  htmltools      0.5.3      2022-07-18 [2] RSPM (R 4.2.0)
##  jquerylib      0.1.4      2021-04-26 [2] RSPM (R 4.2.0)
##  jsonlite       1.8.0      2022-02-22 [2] RSPM (R 4.2.0)
##  knitr          1.39       2022-04-26 [2] RSPM (R 4.2.0)
##  labeling       0.4.2      2020-10-20 [2] RSPM (R 4.2.0)
##  lattice        0.20-45    2021-09-22 [3] CRAN (R 4.2.0)
##  lifecycle      1.0.1      2021-09-24 [2] RSPM (R 4.2.0)
##  lubridate    * 1.8.0      2021-10-07 [2] RSPM (R 4.2.0)
##  magrittr       2.0.3      2022-03-30 [2] RSPM (R 4.2.0)
##  Matrix         1.4-1      2022-03-23 [2] RSPM (R 4.2.0)
##  mgcv           1.8-40     2022-03-29 [2] RSPM (R 4.2.0)
##  munsell        0.5.0      2018-06-12 [2] RSPM (R 4.2.0)
##  nlme           3.1-158    2022-06-15 [2] RSPM (R 4.2.0)
##  officer        0.4.3      2022-06-12 [2] RSPM (R 4.2.0)
##  pillar         1.8.0      2022-07-18 [2] RSPM (R 4.2.0)
##  pkgconfig      2.0.3      2019-09-22 [2] RSPM (R 4.2.0)
##  purrr          0.3.4      2020-04-17 [2] RSPM (R 4.2.0)
##  R6             2.5.1      2021-08-19 [2] RSPM (R 4.2.0)
##  Rcpp           1.0.9      2022-07-08 [2] RSPM (R 4.2.0)
##  rlang          1.0.4      2022-07-12 [2] RSPM (R 4.2.0)
##  rmarkdown      2.14       2022-04-25 [2] RSPM (R 4.2.0)
##  rstudioapi     0.13       2020-11-12 [2] RSPM (R 4.2.0)
##  rversions    * 2.1.1      2021-05-31 [2] RSPM (R 4.2.0)
##  sass           0.4.2      2022-07-16 [2] RSPM (R 4.2.0)
##  scales         1.2.0      2022-04-13 [2] RSPM (R 4.2.0)
##  sessioninfo    1.2.2      2021-12-06 [2] RSPM (R 4.2.0)
##  stringi        1.7.8      2022-07-11 [2] RSPM (R 4.2.0)
##  stringr        1.4.0      2019-02-10 [2] RSPM (R 4.2.0)
##  systemfonts    1.0.4      2022-02-11 [2] RSPM (R 4.2.0)
##  textshaping    0.3.6      2021-10-13 [2] RSPM (R 4.2.0)
##  tibble         3.1.7      2022-05-03 [2] RSPM (R 4.2.0)
##  tidyr        * 1.2.0      2022-02-01 [2] RSPM (R 4.2.0)
##  tidyselect     1.1.2      2022-02-21 [2] RSPM (R 4.2.0)
##  utf8           1.2.2      2021-07-24 [2] RSPM (R 4.2.0)
##  uuid           1.1-0      2022-04-19 [2] RSPM (R 4.2.0)
##  vctrs          0.4.1      2022-04-13 [2] RSPM (R 4.2.0)
##  withr          2.5.0      2022-03-03 [2] RSPM (R 4.2.0)
##  xfun           0.31       2022-05-10 [2] RSPM (R 4.2.0)
##  xml2           1.3.3      2021-11-30 [2] RSPM (R 4.2.0)
##  yaml           2.3.5      2022-02-21 [2] RSPM (R 4.2.0)
##  zip            2.2.0      2021-05-31 [2] RSPM (R 4.2.0)
## 
##  [1] /home/lluis/bin/R/4.2.1
##  [2] /usr/lib/R/site-library
##  [3] /usr/lib/R/library
## 
## ----------------------------------------------------------------------------------------------------------------------&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
