<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>r | B101nfo</title>
    <link>https://llrs.dev/tags/r/</link>
      <atom:link href="https://llrs.dev/tags/r/index.xml" rel="self" type="application/rss+xml" />
    <description>r</description>
    <generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>If it is code you can copy and reuse (MIT) if it is text, please cite and reuse CC-BY 2024.</copyright><lastBuildDate>Sun, 05 May 2024 00:00:00 +0000</lastBuildDate>
    <image>
      <url>img/map[gravatar:%!s(bool=false) shape:circle]</url>
      <title>r</title>
      <link>https://llrs.dev/tags/r/</link>
    </image>
    
    <item>
      <title>Packaging R: getting in repositories</title>
      <link>https://llrs.dev/post/2024/05/05/packaging-r-getting-in/</link>
      <pubDate>Sun, 05 May 2024 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2024/05/05/packaging-r-getting-in/</guid>
      <description>


&lt;p&gt;After the previous post collecting information about repositories, I want to collect here my thoughts on adding a package to a repository and on how repositories are recognized.
As in the previous post, this is built on the assumption that one already has one or more packages and wants to distribute them.&lt;/p&gt;
&lt;p&gt;This is meant as a reflection on what an R repository is and is not aimed at R package developers.
However, their feedback is appreciated to consider what an ideal repository would look like.&lt;/p&gt;
&lt;div id=&#34;package-submission&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Package submission&lt;/h2&gt;
&lt;p&gt;An R repository will have a way to incorporate a package.
The CRAN submission process starts with &lt;a href=&#34;https://cran.r-project.org/submit.html&#34;&gt;a form&lt;/a&gt;, while a Bioconductor submission is made through a &lt;a href=&#34;https://github.com/Bioconductor/Contributions/issues/new&#34;&gt;GitHub issue&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The process will usually start with an automated check.
Until the automated check has passed, probably no one will look into the package submission.
This reduces the hours a human must dedicate to managing submissions.
If a human is kept in the loop, one could appeal the result of the automated check by contacting them or, if it is a random failure, re-submit the package.&lt;/p&gt;
&lt;div class=&#34;float&#34;&gt;
&lt;img src=&#34;images/submissions.png&#34; alt=&#34;Package submission checks: first a check of the package; then, if it is not new, a dependency check from the repository; if all checks pass, the package is added to the repository.&#34; /&gt;
&lt;div class=&#34;figcaption&#34;&gt;&lt;strong&gt;Package submission checks&lt;/strong&gt;: first a check of the package; then, if it is not new, a dependency check from the repository; if all checks pass, the package is added to the repository.&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Generally a package must first pass a package quality check before it is considered for further integration tests.
These integration tests usually check the new version of a package against the packages that depend on it, also known as reverse dependencies.&lt;/p&gt;
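&lt;p&gt;As a rough illustration of what reverse dependencies are, the sketch below looks them up with &lt;code&gt;tools::package_dependencies()&lt;/code&gt; on a tiny, made-up package database.
The packages &lt;code&gt;pkgA&lt;/code&gt;, &lt;code&gt;pkgB&lt;/code&gt; and &lt;code&gt;pkgC&lt;/code&gt; are hypothetical; a real repository would start from something like the output of &lt;code&gt;available.packages()&lt;/code&gt;, and its actual tooling may well differ.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical package database with the columns used below,
# named as in the matrix returned by available.packages().
db &amp;lt;- matrix(
  c(&#34;pkgA&#34;, NA,     NA,
    &#34;pkgB&#34;, &#34;pkgA&#34;, NA,
    &#34;pkgC&#34;, NA,     &#34;pkgA, pkgB&#34;),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c(&#34;Package&#34;, &#34;Depends&#34;, &#34;Imports&#34;))
)

# Packages in the database that declare pkgA in Depends or Imports,
# i.e. the direct reverse dependencies of pkgA.
revdeps &amp;lt;- tools::package_dependencies(
  &#34;pkgA&#34;, db = db,
  which = c(&#34;Depends&#34;, &#34;Imports&#34;),
  reverse = TRUE
)
revdeps$pkgA&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here &lt;code&gt;pkgB&lt;/code&gt; and &lt;code&gt;pkgC&lt;/code&gt; both declare &lt;code&gt;pkgA&lt;/code&gt;, so they are the ones that would be re-checked against a new version of &lt;code&gt;pkgA&lt;/code&gt;.&lt;/p&gt;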
&lt;/div&gt;
&lt;div id=&#34;package-maintenance&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Package maintenance&lt;/h2&gt;
&lt;p&gt;Once a package is included in a repository it usually needs to be maintained.&lt;/p&gt;
&lt;p&gt;There are many moving pieces: chip architectures, OS, R, other packages.
All this means that authors need to keep their packages in good shape if they want them to remain useful to users.
Of course, if one doesn’t want to do that, one does not need to use a repository to share a package.&lt;/p&gt;
&lt;div class=&#34;float&#34;&gt;
&lt;img src=&#34;images/checks.png&#34; alt=&#34;Graphic showing time and different R versions and checks. Repositories check the packages on them on multiple levels.&#34; /&gt;
&lt;div class=&#34;figcaption&#34;&gt;&lt;strong&gt;Graphic showing time and different R versions and checks.&lt;/strong&gt; Repositories check the packages on them on multiple levels.&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;As a result, at any given point in time there must be some tests for any given package under different conditions, as shown in the second image.
This opens the possibility of a package being archived from the repository for failing the checks in place.&lt;/p&gt;
&lt;p&gt;Repositories provide these checks as a service to the users.
They guarantee that R packages in the repository work well together and pass the same set of checks (mostly).
This is what leads to their reputation and usage among users (this is true beyond R: Debian, Ubuntu, …).&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;closing-remarks&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Closing remarks&lt;/h2&gt;
&lt;p&gt;There are several official repositories. How package submission works when a package is submitted to one of them but is related, via dependencies, to other repositories is a matter for another post.&lt;/p&gt;
&lt;p&gt;There is some discussion about what an R repository is.
The importance of CRAN and Bioconductor has led to some confusion.
There are generally two meanings of what a CRAN-like repository is:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;&lt;p&gt;One where &lt;code&gt;install.packages()&lt;/code&gt; works (this is defined by how the files and binaries are organized and will be a topic for another time).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;One where all the checks described here are in place and &lt;code&gt;install.packages()&lt;/code&gt; works too.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
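&lt;p&gt;The first definition can be sketched with base R: &lt;code&gt;install.packages()&lt;/code&gt; looks for packages and their &lt;code&gt;PACKAGES&lt;/code&gt; index under a path derived from the repository URL, which &lt;code&gt;contrib.url()&lt;/code&gt; computes.
The repository URL and package name below are placeholders, not a real repository, so the calls that need network access are left commented out.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Placeholder repository URL (hypothetical, for illustration only)
repo &amp;lt;- &#34;https://example.org/my-repo&#34;

# Where R expects the source packages and their PACKAGES index to live
src_url &amp;lt;- utils::contrib.url(repo, type = &#34;source&#34;)
src_url

# With a real repository one could then list and install packages:
# available.packages(contriburl = src_url)
# install.packages(&#34;somePackage&#34;, repos = repo)&lt;/code&gt;&lt;/pre&gt;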
&lt;p&gt;r-universe uses the first definition but could be used to generate repositories with checks that comply with the second definition.
Other repositories that use the first definition are the &lt;a href=&#34;https://packagemanager.posit.co/client/#/&#34;&gt;&lt;em&gt;Posit&lt;/em&gt; Public &lt;em&gt;Package Manager&lt;/em&gt;&lt;/a&gt; and the &lt;a href=&#34;https://r4pi.org/&#34;&gt;R4Pi repository&lt;/a&gt; (which provides binaries for Raspberry Pi OS).&lt;/p&gt;
&lt;p&gt;As the second definition is stricter, it is the one this post has focused on.&lt;/p&gt;
&lt;p&gt;PS: This post might be edited, as it has been sitting on my computer for several months.
I prefer to post it and improve it with feedback, so let me know if you have any additions.&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>useR madrid: rtweet</title>
      <link>https://llrs.dev/talk/user-madrid-rtweet/</link>
      <pubDate>Thu, 29 Feb 2024 19:00:00 +0200</pubDate>
      <guid>https://llrs.dev/talk/user-madrid-rtweet/</guid>
      <description>


&lt;p&gt;This presentation was in Spanish. I shared the history of my involvement with rtweet and what is happening with the package and Twitter API.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Packaging R: repositories</title>
      <link>https://llrs.dev/post/2023/12/09/packaging-r-repositories/</link>
      <pubDate>Sat, 09 Dec 2023 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2023/12/09/packaging-r-repositories/</guid>
      <description>


&lt;p&gt;In this post I want to collect some thoughts about R repositories.
In R we have multiple repositories that store packages for users.
In this post I want to write about the purpose, functionality, benefits and drawbacks of R repositories and how packages are managed.
The goal is to summarize what I’ve learnt about them over the last few years.
I’ll also collect some information about them from various sources to make it easier for myself to find it later on.&lt;/p&gt;
&lt;p&gt;I am writing this because I am worried about the future of CRAN and R.
Due to multiple circumstances, the current position is not sustainable as is.
I hope that this post will help me understand the past and present, and come up with some concrete steps to take.&lt;/p&gt;
&lt;div id=&#34;history&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;History&lt;/h1&gt;
&lt;p&gt;I was not there, but the first repository started around April 1997.
This repository is CRAN: the Comprehensive R Archive Network.
The &lt;a href=&#34;https://stat.ethz.ch/pipermail/r-devel/1997-April/017026.html&#34;&gt;first mention&lt;/a&gt; I found is already about changes in it, but it was not until the end of the month that &lt;a href=&#34;https://stat.ethz.ch/pipermail/r-announce/1997/000001.html&#34;&gt;it was announced&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;CRAN was created by a few volunteers, some of whom are still maintaining it 25 years later.
The current team is listed on &lt;a href=&#34;https://cran.r-project.org/CRAN_team.htm&#34;&gt;their website&lt;/a&gt;.
From the beginning it was “a collection of sites which carry identical material, consisting of the R&amp;amp;R R distribution(s), the contributed extensions, documentation for R, and binaries.”&lt;/p&gt;
&lt;p&gt;Omegahat was another repository created &lt;a href=&#34;https://omegahat.net/&#34;&gt;shortly after CRAN&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Omega project began in July, 1998, with discussions among designers responsible for three current statistical languages (S, R, and Lisp-Stat), with the idea of working together on new directions with special emphasis on web-based software, Java, the Java virtual machine, and distributed computing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Many developers of Omegahat were in the R Core or CRAN team.
It was available as a repository from the R source code but was definitively removed in R 4.1, in 2021&lt;a href=&#34;#fn1&#34; class=&#34;footnote-ref&#34; id=&#34;fnref1&#34;&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Bioconductor was the next major repository to appear.
It was founded by Robert Gentleman and others in 2004 (when its mailing list started).
A paper describing it &lt;a href=&#34;https://doi.org/10.1186/gb-2004-5-10-r80&#34;&gt;appeared in late 2004&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;an initiative for the collaborative creation of extensible software for computational biology and bioinformatics.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Throughout their history, repositories have evolved with R, and R with them.
For example, R was released twice a year at the beginning, and so was Bioconductor.
But when R moved to one release per year (in 2013, with version 3.0), Bioconductor kept making two releases a year.
This introduced some problems when installing packages from Bioconductor, since a single R release can be compatible with two Bioconductor releases&lt;a href=&#34;#fn2&#34; class=&#34;footnote-ref&#34; id=&#34;fnref2&#34;&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In other cases, checks have evolved.
For instance, &lt;a href=&#34;https://en.wikipedia.org/wiki/Oracle_Solaris&#34;&gt;Solaris&lt;/a&gt; was used to test packages in CRAN until 2021, if I recall correctly, because it allowed testing with a proprietary C or C++ compiler.
This led to the discovery of bugs but also to more distress among R package developers, who had difficulties checking their packages in that environment.&lt;/p&gt;
&lt;p&gt;Other checks evolve with R, becoming stricter with time: in the early versions of R the use of NAMESPACE was not regulated.
But since R version 2.15 (2012) it has been compulsory, even for data-only packages&lt;a href=&#34;#fn3&#34; class=&#34;footnote-ref&#34; id=&#34;fnref3&#34;&gt;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;.
This was synchronized with the repositories’ checks.&lt;/p&gt;
&lt;p&gt;Last, some goals/desires of CRAN are not fulfilled (or were abandoned).
For example, from the start CRAN aimed to have packages authenticated (see the bottom of &lt;a href=&#34;https://stat.ethz.ch/pipermail/r-announce/1997/000001.html&#34;&gt;the announcement&lt;/a&gt;).
This might be due to a lack of time or resources, or because the plans are in progress but require (volunteer) time.&lt;/p&gt;
&lt;p&gt;With time, different repositories arose:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;MRAN, which was available from September 17th, 2014 to July 1st, 2023.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The RStudio Public Package Manager, later renamed &lt;a href=&#34;https://packagemanager.posit.co/&#34;&gt;Posit Public Package Manager&lt;/a&gt;, has had &lt;a href=&#34;https://posit.co/blog/the-road-to-building-ten-million-binaries/&#34;&gt;binaries for several OS&lt;/a&gt; since 2019.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There is the &lt;a href=&#34;https://pkgs.r4pi.org/&#34;&gt;R4pi repository&lt;/a&gt; with binaries for Raspberry Pi.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I remember a proteomics repository available.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;rOpenSci started its own repository which later evolved into the &lt;a href=&#34;https://r-universe.org&#34;&gt;r-universe&lt;/a&gt;.
The r-universe currently can provide binaries of packages that are hosted in a git repository.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;literature&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Literature&lt;/h1&gt;
&lt;p&gt;The role and prominence of the repositories has led to many articles being written about them.
I wanted to link and collect some of them for easier retrieval.&lt;/p&gt;
&lt;p&gt;I was wondering how CRAN is described by the volunteers that built it.
From the announcing email:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;CRAN is a collection of sites which carry identical material, consisting of the R&amp;amp;R R distribution(s), the contributed extensions, documentation for R, and binaries.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;From the &lt;a href=&#34;https://cran.r-project.org&#34;&gt;website&lt;/a&gt; (as of 2023/12/09):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;CRAN is a network of ftp and web servers around the world that store identical, up-to-date, versions of code and documentation for R.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Initially there was R NEWS, with an article dedicated to CRAN and one to Omegahat too.
These articles usually describe new package additions but sometimes they also provide information about changes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2001-1-cran&#34;&gt;CRAN-2001-1&lt;/a&gt;: lists new packages, &lt;a href=&#34;https://journal.r-project.org/news/RN-2001-2-cran&#34;&gt;CRAN-2001-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2001-3-cran&#34;&gt;CRAN-2001-3&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/articles/RN-2001-008/&#34;&gt;Omegahat-2001-3&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2002-1-cran&#34;&gt;CRAN-2002-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2002-2-cran/&#34;&gt;CRAN-2002-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2002-3-cran/&#34;&gt;CRAN-2002-3&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2003-1-cran/&#34;&gt;CRAN-2003-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2003-2-cran/&#34;&gt;CRAN-2003-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2003-3-cran/&#34;&gt;CRAN-2003-3&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2004-1-cran/&#34;&gt;CRAN-2004-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2004-2-cran/&#34;&gt;CRAN-2004-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2005-1-cran/&#34;&gt;CRAN-2005-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2005-2-cran/&#34;&gt;CRAN-2005-2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Since 2006 there has also been an article about Bioconductor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2006-2-cran/&#34;&gt;CRAN-2006-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2006-2-bioc&#34;&gt;Bioc-2006-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2007-1-cran/&#34;&gt;CRAN-2007-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2007-2-cran/&#34;&gt;CRAN-2007-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2007-2-bioc&#34;&gt;Bioc-2007-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2007-3-cran/&#34;&gt;CRAN-2007-3&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RN-2008-1-cran/&#34;&gt;CRAN-2008-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2008-1-bioc&#34;&gt;Bioc-2008-1&lt;/a&gt; &lt;a href=&#34;https://journal.r-project.org/news/RN-2008-2-cran/&#34;&gt;CRAN-2008-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RN-2008-2-bioc&#34;&gt;Bioc-2008-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Later it became the &lt;a href=&#34;https://journal.r-project.org/&#34;&gt;R Journal&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/issues/2009-1/RJ-2009-1.pdf&#34;&gt;CRAN-2009-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/issues/2009-2/RJ-2009-2.pdf&#34;&gt;CRAN-2009-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/issues/2010-1/RJ-2010-1.pdf&#34;&gt;CRAN-2010-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/issues/2010-2/RJ-2010-2.pdf&#34;&gt;CRAN-2010-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/issues/2011-1/RJ-2011-1.pdf&#34;&gt;CRAN-2011-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/issues/2011-2/RJ-2011-2.pdf&#34;&gt;CRAN and Bioconductor 2011-2&lt;/a&gt;.
The Bioconductor section mentions the conference and important directions for the Bioconductor core.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/issues/2012-1/RJ-2012-1.pdf&#34;&gt;CRAN-2012-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/issues/2012-2/RJ-2012-2.pdf&#34;&gt;CRAN and Bioconductor 2012-2&lt;/a&gt;: Mentions &lt;code&gt;biocLite()&lt;/code&gt; to install packages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2013-1-cran&#34;&gt;CRAN-2013-1&lt;/a&gt; &lt;a href=&#34;https://journal.r-project.org/news/RJ-2013-1-bioconductor/&#34;&gt;Bioc-2013-1&lt;/a&gt;: mentions better integration of parallel evaluation.&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2013-2-cran/&#34;&gt;CRAN-2013-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2013-2-bioconductor/&#34;&gt;Bioc-2013-2&lt;/a&gt;: Mentions again AnnotationHub&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2014-1-cran/&#34;&gt;CRAN-2014-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2014-1-bioconductor/&#34;&gt;Bioc-2014-1&lt;/a&gt;: Mentions the git-svn bridge to synchronize git and svn repository.&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2014-2-cran/&#34;&gt;CRAN-2014-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2014-2-bioconductor/&#34;&gt;Bioc-2014-2&lt;/a&gt;: Bioconductor 3.0 release; besides packages, Amazon Machine Images are offered, as well as Docker images.
Packages are required to pass BiocCheck, a separate package with checks specific to Bioconductor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2015-1-cran/&#34;&gt;CRAN-2015-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2015-1-bioconductor/&#34;&gt;Bioc-2015-1&lt;/a&gt;: Same mentions as the previous one, plus encouragement regarding the guidelines and package submission.&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2015-2-cran/&#34;&gt;CRAN-2015-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2015-2-bioconductor/&#34;&gt;Bioc-2015-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2016-1-cran/&#34;&gt;CRAN-2016-1&lt;/a&gt;: this article has a plot of the number of CRAN packages over time, and it does not list all the packages.
It explicitly mentions that the CRAN team asked for help processing package submissions and some people stepped up.
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2016-1-bioconductor/&#34;&gt;Bioc-2016-1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2016-2-cran/&#34;&gt;CRAN-2016-2&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2016-2-bioc/&#34;&gt;Bioc-2016-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2017-1-cran/&#34;&gt;CRAN-2017-1&lt;/a&gt;: mentions changes in CRAN checks, adding new memory access and static code analysis checks.
It mentions that the submission has moved to a more automated one.
It also mentions changes in the CRAN Repository Policy.
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2017-1-bioc/&#34;&gt;Bioc-2017-1&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2018-1-cran/&#34;&gt;CRAN-2018-1&lt;/a&gt;: checks in alternative BLAS/LAPACK implementations, the submission pipeline is defined.
For the first time, the number of actions taken by CRAN reviewers is listed, in two categories: automatic and manual.
Changes in repository policy are listed.
Changes in the location of the package repository archive, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2018-1-bioc/&#34;&gt;Bioc-2018-1&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2018-2-cran/&#34;&gt;CRAN-2018-2&lt;/a&gt;: Changes in policy; packages should not give a check warning or error.
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2018-2-bioc/&#34;&gt;Bioc-2018-2&lt;/a&gt;: Moved to BiocManager to install packages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2019-1-cran/&#34;&gt;CRAN-2019-1&lt;/a&gt;: More mentions of CRAN mirror security.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2019-2-cran/&#34;&gt;CRAN-2019-2&lt;/a&gt;: Updates in checklist for CRAN submissions, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2019-2-bioc/&#34;&gt;Bioc-2019-2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2020-1-cran/&#34;&gt;CRAN-2020-1&lt;/a&gt;: Many changes in CRAN policies.
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2020-2-cran/&#34;&gt;CRAN-2020-2&lt;/a&gt;: Many changes to CRAN policies.
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2020-2-bioc/&#34;&gt;Bioc-2020-2&lt;/a&gt;: Announces the Technical and Community advisory boards (as well as the project-wide Code of Conduct).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2021-1-cran/&#34;&gt;CRAN-2021-1&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2021-1-bioc/&#34;&gt;Bioc-2021-1&lt;/a&gt;: Mentions conferences that will be virtual.&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2021-2-cran/&#34;&gt;CRAN-2021-2&lt;/a&gt;: Shows an &lt;a href=&#34;https://cran.r-project.org/incoming/&#34;&gt;incoming&lt;/a&gt; path (see &lt;a href=&#34;https://r-hub.github.io/cransays/articles/dashboard.html&#34;&gt;this friendly viewer&lt;/a&gt;), &lt;a href=&#34;https://journal.r-project.org/news/RJ-2021-2-bioc/&#34;&gt;Bioc-2021-2&lt;/a&gt;: Mentions AnVIL and two online workshops to develop workflows.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2022-1-cran/&#34;&gt;CRAN-2022-1&lt;/a&gt;: Lists a change in CRAN policy and the CRAN Task View Initiative.&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2022-2-cran/&#34;&gt;CRAN-2022-2&lt;/a&gt;: Lists some more repository policies.
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2022-2-bioconductor/&#34;&gt;Bioc-2022-2&lt;/a&gt;: Lists infrastructure updates (and its funding), changes in the core team and new initiatives.&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2022-3-cran/&#34;&gt;CRAN-2022-3&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2022-3-bioconductor/&#34;&gt;Bioc-2022-3&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://journal.r-project.org/news/RJ-2022-4-cran/&#34;&gt;CRAN-2022-4&lt;/a&gt;, &lt;a href=&#34;https://journal.r-project.org/news/RJ-2022-4-bioconductor/&#34;&gt;Bioc-2022-4&lt;/a&gt;: default branch renaming, partnership with Outreachy and blog are featured.
Several working groups provide updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://journal.r-project.org/news/RJ-2023-1-cran/&#34;&gt;CRAN-2023-1&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In addition, several articles and blog posts have appeared.
Of those I found, the following are worth mentioning:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://doi.org/10.17713/ajs.v41i1.188&#34;&gt;Are There Too Many R Packages?&lt;/a&gt; and &lt;a href=&#34;https://www.r-bloggers.com/2014/04/does-r-have-too-many-packages/&#34;&gt;derived posts&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://www.jumpingrivers.com/blog/security-r-hacking-bioconductor/&#34;&gt;Hacking Bioconductor&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And my own posts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/post/2021/12/07/reasons-cran-archivals/&#34;&gt;Reasons CRAN packages are archived&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/&#34;&gt;CRAN files part 1&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/&#34;&gt;CRAN files part 2&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/post/2023/05/03/cran-maintained-packages/&#34;&gt;CRAN maintained packages&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/post/2021/01/31/cran-review/&#34;&gt;CRAN review&lt;/a&gt; (and the &lt;a href=&#34;https://llrs.dev/talk/user-2021/&#34;&gt;talk at useR! 2021&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/post/2020/07/31/bioconductor-submissions-reviews/&#34;&gt;Bioconductor review&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/post/2020/09/02/ropensci-submissions/&#34;&gt;rOpenSci&lt;/a&gt; reviews&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://llrs.dev/2020/07/bioconductor-submissions-reviews/&#34;&gt;Bioconductor reviews&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The article &lt;a href=&#34;https://journal.r-project.org/articles/RJ-2009-014/&#34;&gt;“Aspects of the Social Organization and Trajectory of the R Project”&lt;/a&gt;, from the R Journal 2009, also has a section about CRAN, noting that it “is challenged by its own success”.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;characteristics&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Characteristics&lt;/h1&gt;
&lt;p&gt;The predominance of CRAN and its role as the primary and default R repository has led to some special treatment of the repository.&lt;/p&gt;
&lt;p&gt;CRAN checks are in the R source code itself, while other repositories have their own checks in different tools.
In addition, the environment variables used by CRAN are documented in the &lt;a href=&#34;https://cran.r-project.org/doc/manuals/r-release/R-ints.html&#34;&gt;R-internals&lt;/a&gt; (they are more or less accessible in the &lt;a href=&#34;https://svn.r-project.org/R-dev-web/trunk/CRAN/&#34;&gt;svn repository&lt;/a&gt; too).&lt;/p&gt;
&lt;p&gt;Others who know more have stated the benefits of CRAN too. This text is copied from Henrik Bengtsson in the &lt;a href=&#34;https://community-bioc.slack.com/archives/CLF37V6C8/p1698869264884649?thread_ts=1698804037.467439&amp;amp;cid=CLF37V6C8&#34; title=&#34;Link to the thread&#34;&gt;Bioconductor Slack&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;FOREVER ARCHIVE:&lt;/p&gt;
&lt;p&gt;The first one is that it publishes packages and versions of them until the end of time.
When a package has been published on CRAN, it takes a lot for it to be removed from there.
I don’t know if it ever happened, but I can imagine a package can be fully removed if it was illegally published in the first place (e.g. copyright, illegal content, ...) or malicious.&lt;/p&gt;
&lt;p&gt;INSTALLATION SERVICE:&lt;/p&gt;
&lt;p&gt;Then CRAN also provides a R package repository service for installing packages on CRAN using built-in R functions.
The set of packages in the package repo is a subset of all packages on CRAN.
The CRAN package repo makes a promise that all packages listed in PACKAGES can be installed.
If they cannot make that promise, they’ll archive the package (=remove it from PACKAGES).
I should also say, install.packages(url) can be used to install from the set of packages that are archived.
Technically, old package versions are always archived.&lt;/p&gt;
&lt;p&gt;CHECK SERVICE:&lt;/p&gt;
&lt;p&gt;The content of the R package repository is guided by the CRAN package checks that run on R-oldrel, R-release, and R-devel across multiple platforms.
The minimal requirement is that no package should remain in the package repository if the checks detects ERRORs (and those errors are not due to recently introduced bugs in R-devel).
WARNINGs can also cause a package to be archived, but that process often takes longer.
AFAIK, NOTEs are not a cause for a package being archived (but I could be wrong).
The CRAN incoming checks, which you have to pass when you submit a new package, or an updated version, will make sure that the published package pass with all OKs.
(It’s possible to argue for NOTEs being false positives, or for them not to be fixed, but that requires a manual approval by the CRAN Team).&lt;/p&gt;
&lt;/blockquote&gt;
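&lt;p&gt;To illustrate the remark about archived packages in the quote above, here is a minimal sketch of how the URL of an archived source version could be built and then installed.
The helper &lt;code&gt;archive_url()&lt;/code&gt; is hypothetical, the package name and version are placeholders, and the path assumes the &lt;code&gt;src/contrib/Archive/&lt;/code&gt; layout of the CRAN archive.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical helper: build the URL of an archived source tarball,
# assuming the src/contrib/Archive/ layout of the CRAN archive.
archive_url &amp;lt;- function(pkg, version,
                        cran = &#34;https://cran.r-project.org&#34;) {
  paste0(cran, &#34;/src/contrib/Archive/&#34;, pkg, &#34;/&#34;,
         pkg, &#34;_&#34;, version, &#34;.tar.gz&#34;)
}

# Placeholder package name and version
url &amp;lt;- archive_url(&#34;somePackage&#34;, &#34;1.0.0&#34;)
url

# Installing from that URL (not run: it needs a real package and network access)
# install.packages(url, repos = NULL, type = &#34;source&#34;)&lt;/code&gt;&lt;/pre&gt;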
&lt;p&gt;I think there are many more resources discussing R repositories.
If you know more I’ll be happy to update this post.&lt;/p&gt;
&lt;p&gt;Before I dwell too long on the next steps, I’ll post this and collect some more articles I might have missed.&lt;/p&gt;
&lt;p&gt;Last, Uwe Ligges gave a presentation about &lt;a href=&#34;https://www.youtube.com/watch?v=-vX-CDiiZKI&#34;&gt;CRAN at useR! 2017&lt;/a&gt;; thanks to Tim Taylor for &lt;a href=&#34;https://fosstodon.org/@_TimTaylor/111612010185631808&#34;&gt;sharing it&lt;/a&gt;. In this video there is an explanation of why the Solaris OS was used.&lt;/p&gt;
&lt;p&gt;It has come to my attention that there is an article, by G. Brooke Anderson and Dirk Eddelbuettel, about the structure of R package repositories (among other things): &lt;a href=&#34;https://journal.r-project.org/archive/2017/RJ-2017-026/RJ-2017-026.pdf&#34;&gt;Hosting Data Packages via drat: A Case Study with Hurricane Exposure Data&lt;/a&gt;&lt;/p&gt;
&lt;div id=&#34;reproducibility&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Reproducibility&lt;/h3&gt;
&lt;details&gt;
&lt;pre&gt;&lt;code&gt;## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.1 (2023-06-16)
##  os       Ubuntu 22.04.3 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Europe/Madrid
##  date     2024-01-15
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.18    2023-06-19 [1] CRAN (R 4.3.1)
##  bookdown      0.37    2023-12-01 [1] CRAN (R 4.3.1)
##  bslib         0.6.1   2023-11-28 [1] CRAN (R 4.3.1)
##  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.1)
##  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.1)
##  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
##  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.2)
##  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
##  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.1)
##  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.1)
##  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.2)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.2)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.1)
##  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.1)
##  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
##  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
##  sass          0.4.8   2023-12-06 [1] CRAN (R 4.3.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
##  xfun          0.41    2023-11-01 [1] CRAN (R 4.3.2)
##  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.1)
## 
##  [1] /home/lluis/bin/R/4.3.1
##  [2] /opt/R/4.3.1/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&#34;footnotes footnotes-end-of-document&#34;&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id=&#34;fn1&#34;&gt;&lt;p&gt;In version 3.1.2 &lt;a href=&#34;https://cran.r-project.org/doc/manuals/NEWS.3&#34;&gt;Omegahat didn’t provide&lt;/a&gt; Windows binaries and in 4.1 it was removed from the default repositories (see 4.1 in &lt;a href=&#34;https://cran.r-project.org/doc/manuals/r-release/NEWS.html&#34;&gt;NEWS(.4)&lt;/a&gt;).&lt;a href=&#34;#fnref1&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&#34;fn2&#34;&gt;&lt;p&gt;This led to the need for a special function to install packages from Bioconductor.
Initially this was the &lt;code&gt;biocLite&lt;/code&gt; function and later the &lt;a href=&#34;https://cran.r-project.org/package=BiocManager&#34;&gt;BiocManager package&lt;/a&gt;.&lt;a href=&#34;#fnref2&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&#34;fn3&#34;&gt;&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/doc/manuals/NEWS.2&#34;&gt;NEWS in 2.15 section&lt;/a&gt;&lt;a href=&#34;#fnref3&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>BaseSet 0.9.0</title>
      <link>https://llrs.dev/post/2023/08/23/baseset-0-9-0/</link>
      <pubDate>Wed, 23 Aug 2023 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2023/08/23/baseset-0-9-0/</guid>
      <description>


&lt;p&gt;I’m excited to provide a new release of &lt;a href=&#34;https://cran.r-project.org/package=BaseSet&#34;&gt;BaseSet&lt;/a&gt;, the package implementing a class and methods to work with (fuzzy) sets.&lt;/p&gt;
&lt;p&gt;This release focused on making the package easier to work with.&lt;/p&gt;
&lt;p&gt;From the beginning it was engineered towards the tidyverse and this time I focused on general R methods like &lt;code&gt;$&lt;/code&gt;, &lt;code&gt;[&lt;/code&gt;, &lt;code&gt;c&lt;/code&gt;:&lt;/p&gt;
&lt;div id=&#34;new-methods&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;New methods&lt;/h2&gt;
&lt;p&gt;First we can create a TidySet or TS for short:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(&amp;quot;BaseSet&amp;quot;, warn.conflicts = FALSE)
packageVersion(&amp;quot;BaseSet&amp;quot;)
## [1] &amp;#39;0.9.0&amp;#39;
l &amp;lt;- list(A = &amp;quot;1&amp;quot;,
     B = c(&amp;quot;1&amp;quot;, &amp;quot;2&amp;quot;),
     C = c(&amp;quot;2&amp;quot;, &amp;quot;3&amp;quot;, &amp;quot;4&amp;quot;),
     D = c(&amp;quot;1&amp;quot;, &amp;quot;2&amp;quot;, &amp;quot;3&amp;quot;, &amp;quot;4&amp;quot;)
)
TS &amp;lt;- tidySet(l)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Up till now there was no compatibility with the base R methods but there was with the tidyverse.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;TSa &amp;lt;- TS[[&amp;quot;A&amp;quot;]]
TSb &amp;lt;- TS[[&amp;quot;B&amp;quot;]]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This may not look like much, but previously it wasn’t possible to subset the class.
Initially I thought that working with a single class per session would be enough.
Later I realized that maybe people would have good reasons to split or combine multiple objects:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;TSab &amp;lt;- c(TSa, TSb)
TSab
##   elements sets fuzzy
## 1        1    A     1
## 2        1    B     1
## 3        2    B     1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that subsetting by sets does not produce the same object as elements are kept:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;dim(TSab)
##  Elements Relations      Sets 
##         2         3         2
dim(TS[1:2, &amp;quot;sets&amp;quot;])
##  Elements Relations      Sets 
##         4         3         2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You’ll need to drop the elements:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;dim(droplevels(TS[1:2, &amp;quot;sets&amp;quot;]))
##  Elements Relations      Sets 
##         2         3         2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can include more information like this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;TSab[1:2, &amp;quot;relations&amp;quot;, &amp;quot;type&amp;quot;] &amp;lt;- c(&amp;quot;new&amp;quot;, &amp;quot;addition&amp;quot;)
TSab[1:2, &amp;quot;sets&amp;quot;, &amp;quot;origin&amp;quot;] &amp;lt;- c(&amp;quot;fake&amp;quot;, &amp;quot;real&amp;quot;)
TSab
##   elements sets fuzzy     type origin
## 1        1    A     1      new   fake
## 2        1    B     1 addition   real
## 3        2    B     1     &amp;lt;NA&amp;gt;   real&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With this release it is easier to access the columns of the TidySet:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;TSab$type
## [1] &amp;quot;new&amp;quot;      &amp;quot;addition&amp;quot; NA
TSab$origin
## [1] &amp;quot;fake&amp;quot; &amp;quot;real&amp;quot;
TS$sets
##  [1] &amp;quot;A&amp;quot; &amp;quot;B&amp;quot; &amp;quot;B&amp;quot; &amp;quot;C&amp;quot; &amp;quot;C&amp;quot; &amp;quot;C&amp;quot; &amp;quot;D&amp;quot; &amp;quot;D&amp;quot; &amp;quot;D&amp;quot; &amp;quot;D&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you pay attention you’ll realize that it will look at the minimum information required.
But if the column is present in the relations and elements or sets slots it will pick the first.&lt;/p&gt;
&lt;p&gt;You can use:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;TS[, &amp;quot;sets&amp;quot;, &amp;quot;new&amp;quot;] &amp;lt;- &amp;quot;a&amp;quot;
TS[, &amp;quot;sets&amp;quot;, &amp;quot;new&amp;quot;]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I recommend carefully reading the help page of &lt;code&gt;?`extract-TidySet`&lt;/code&gt; and making some tests based on the examples.
I might have created some bugs or friction points with the extraction operations; let me know and I’ll address them (that’s the reason why I kept it below a 1.0 release).&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;more-usable&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;More usable&lt;/h1&gt;
&lt;p&gt;Another usability addition to the class is the possibility to autocomplete.&lt;/p&gt;
&lt;p&gt;Now if you type &lt;code&gt;TS$ty&lt;/code&gt; and press TAB it should complete to &lt;code&gt;TS$type&lt;/code&gt; because there is a column called type. This will make it easier to use the &lt;code&gt;$&lt;/code&gt;.&lt;/p&gt;
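&lt;p&gt;One common way to get this kind of completion in R is to define a &lt;code&gt;.DollarNames&lt;/code&gt; method for the class. The following is only a minimal sketch with a hypothetical S3 class &lt;code&gt;toy_set&lt;/code&gt;; it is not necessarily how BaseSet implements it:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical toy class wrapping a data.frame (not the TidySet class)
toy &amp;lt;- structure(
  list(data = data.frame(type = &amp;quot;new&amp;quot;, origin = &amp;quot;fake&amp;quot;)),
  class = &amp;quot;toy_set&amp;quot;
)
# `$` returns a column of the wrapped data.frame
`$.toy_set` &amp;lt;- function(x, name) unclass(x)$data[[name]]
# Candidates offered when pressing TAB after `toy$`
.DollarNames.toy_set &amp;lt;- function(x, pattern = &amp;quot;&amp;quot;) {
  grep(pattern, names(unclass(x)$data), value = TRUE)
}
# Ask for the completions of a name starting with &amp;quot;ty&amp;quot;
completions &amp;lt;- utils::.DollarNames(toy, &amp;quot;^ty&amp;quot;)
completions&lt;/code&gt;&lt;/pre&gt;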
&lt;p&gt;With this release, we can now check the number of sets and the number of relations each set has:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;length(TS)
## [1] 4
lengths(TS)
## A B C D 
## 1 2 3 4&lt;/code&gt;&lt;/pre&gt;
&lt;div id=&#34;new-function&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;New function&lt;/h2&gt;
&lt;p&gt;The new function &lt;code&gt;union_closed&lt;/code&gt; checks if the combinations of sets produce already existing sets.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;union_closed(TS, sets = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;))
## [1] FALSE
union_closed(TS)
## [1] TRUE&lt;/code&gt;&lt;/pre&gt;
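&lt;p&gt;To make the idea concrete, here is a rough base R sketch of the same check on a plain list of sets. The helper &lt;code&gt;is_union_closed&lt;/code&gt; is hypothetical (it is not part of BaseSet), it ignores fuzzy memberships and it assumes at least two sets:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;l &amp;lt;- list(A = &amp;quot;1&amp;quot;,
     B = c(&amp;quot;1&amp;quot;, &amp;quot;2&amp;quot;),
     C = c(&amp;quot;2&amp;quot;, &amp;quot;3&amp;quot;, &amp;quot;4&amp;quot;),
     D = c(&amp;quot;1&amp;quot;, &amp;quot;2&amp;quot;, &amp;quot;3&amp;quot;, &amp;quot;4&amp;quot;)
)
# Hypothetical helper: TRUE if the union of every pair of sets is itself one of the sets
is_union_closed &amp;lt;- function(sets) {
  # Sort the elements so that equal sets compare as identical
  canon &amp;lt;- lapply(sets, function(s) sort(unique(s)))
  pairs &amp;lt;- combn(seq_along(canon), 2, simplify = FALSE)
  all(vapply(pairs, function(p) {
    u &amp;lt;- sort(union(canon[[p[1]]], canon[[p[2]]]))
    any(vapply(canon, identical, logical(1), y = u))
  }, logical(1)))
}
closed_abc &amp;lt;- is_union_closed(l[c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;)])
closed_all &amp;lt;- is_union_closed(l)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here the union of A and C contains all four elements, which only matches set D, so the check fails when D is left out.&lt;/p&gt;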
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;next-steps&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Next steps&lt;/h1&gt;
&lt;p&gt;I hope this makes it even easier to work with the class, combine different objects, and manipulate them more intuitively.&lt;/p&gt;
&lt;p&gt;While creating this document I realized it has some friction points.&lt;br /&gt;
In the next release it will be possible to:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Subset the object by element or set name, if only querying elements and sets slots.
For example &lt;code&gt;TS[c(&#34;3&#34;, &#34;4&#34;), &#34;elements&#34;, &#34;NEWS&#34;] &amp;lt;- TRUE&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;names&lt;/code&gt; and &lt;code&gt;dimnames&lt;/code&gt; to discover which data is in the object.&lt;/li&gt;
&lt;li&gt;Some bug fixes about these methods.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Enjoy!&lt;/p&gt;
&lt;p&gt;I would also appreciate hearing some feedback about how you are using the package.
It will help me to direct the development/maintenance of the package to where it is most useful.&lt;/p&gt;
&lt;div id=&#34;reproducibility&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Reproducibility&lt;/h3&gt;
&lt;details&gt;
&lt;pre&gt;&lt;code&gt;## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.1 (2023-06-16)
##  os       Ubuntu 22.04.3 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Europe/Madrid
##  date     2023-12-18
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  BaseSet     * 0.9.0   2023-08-23 [1] local
##  blogdown      1.18    2023-06-19 [1] CRAN (R 4.3.1)
##  bookdown      0.37    2023-12-01 [1] CRAN (R 4.3.1)
##  bslib         0.6.1   2023-11-28 [1] CRAN (R 4.3.1)
##  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.1)
##  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.1)
##  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
##  dplyr         1.1.4   2023-11-17 [1] CRAN (R 4.3.1)
##  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.2)
##  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.1)
##  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
##  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.1)
##  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.1)
##  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.1)
##  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.1)
##  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.2)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.2)
##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.1)
##  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.1)
##  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.1)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.1)
##  rlang         1.1.2   2023-11-04 [1] CRAN (R 4.3.1)
##  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
##  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
##  sass          0.4.8   2023-12-06 [1] CRAN (R 4.3.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
##  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.1)
##  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.1)
##  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.2)
##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.1)
##  xfun          0.41    2023-11-01 [1] CRAN (R 4.3.2)
##  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.1)
## 
##  [1] /home/lluis/bin/R/4.3.1
##  [2] /opt/R/4.3.1/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>experDesign: follow up</title>
      <link>https://llrs.dev/post/2023/04/09/experdesign-follow-up/</link>
      <pubDate>Sun, 09 Apr 2023 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2023/04/09/experdesign-follow-up/</guid>
      <description>


&lt;p&gt;I am happy to announce a new release of experDesign.
Install it from CRAN with:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;experDesign&amp;quot;)
library(&amp;quot;experDesign&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This new release focuses on some trickier aspects of designing an experiment:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Checking the samples of your experiment.&lt;/li&gt;
&lt;li&gt;How to continue stratifying your conditions after some initial batch.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These functions should be used once you have collected your samples but before carrying out anything else.
With them you can make an informed decision about what might happen with your experiment.&lt;/p&gt;
&lt;div id=&#34;checking-your-samples&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Checking your samples&lt;/h1&gt;
&lt;p&gt;The new function &lt;code&gt;check_data()&lt;/code&gt; will warn you if it finds some known issues with your data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(&amp;quot;experDesign&amp;quot;)
library(&amp;quot;MASS&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If we take the survey dataset from the MASS package we can see that it has some issues:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;data(survey, package = &amp;quot;MASS&amp;quot;)
check_data(survey)
## Warning: Two categorical variables don&amp;#39;t have all combinations.
## Warning: Some values are missing.
## Warning: There is a combination of categories with no replicates; i.e. just one
## sample.
## [1] FALSE&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If we instead fabricate our own dataset, we might realize we have a problem:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rdata &amp;lt;- expand.grid(sex = c(&amp;quot;M&amp;quot;, &amp;quot;F&amp;quot;), class = c(&amp;quot;lower&amp;quot;, &amp;quot;median&amp;quot;, &amp;quot;high&amp;quot;))
stopifnot(&amp;quot;Same samples/rows as combinations of classes&amp;quot; = nrow(rdata) == 2*3)
check_data(rdata)
## Warning: There is a combination of categories with no replicates; i.e. just one
## sample.
## [1] FALSE
# We create some new samples with the same conditions
rdata2 &amp;lt;- rbind(rdata, rdata)
check_data(rdata2)
## [1] TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;One might decide to go ahead with what is available, use only some of those samples, or wait to collect more samples for the experiment.&lt;/p&gt;
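&lt;p&gt;A rough way to see where the warning about replicates comes from, using only base R (this is a sketch, not how &lt;code&gt;check_data()&lt;/code&gt; is implemented), is to count the samples in each combination of categories:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rdata &amp;lt;- expand.grid(sex = c(&amp;quot;M&amp;quot;, &amp;quot;F&amp;quot;), class = c(&amp;quot;lower&amp;quot;, &amp;quot;median&amp;quot;, &amp;quot;high&amp;quot;))
# Number of samples per combination of sex and class
counts &amp;lt;- table(rdata$sex, rdata$class)
# TRUE if some combination has fewer than two samples
lacks_replicates &amp;lt;- any(counts &amp;lt; 2)
# Duplicating the samples gives every combination two samples
rdata2 &amp;lt;- rbind(rdata, rdata)
lacks_replicates2 &amp;lt;- any(table(rdata2$sex, rdata2$class) &amp;lt; 2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the first dataset every combination appears exactly once, while in the duplicated one every combination appears twice.&lt;/p&gt;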
&lt;/div&gt;
&lt;div id=&#34;follow-up&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Follow up&lt;/h1&gt;
&lt;p&gt;Imagine you have 100 samples that you distribute in 4 batches of 25 samples each.
Later, you collect 80 more samples to analyze.
You want these new samples to be analyzed together with those previous 100 samples.
Will it be possible? How should you distribute your new samples in groups of 25?&lt;/p&gt;
&lt;p&gt;Using the same dataset from &lt;code&gt;MASS&lt;/code&gt; imagine if we first collected 118 observations and later 119 more:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;survey1 &amp;lt;- survey[1:118, ]
survey2 &amp;lt;- survey[119:nrow(survey), ]
# Using a low number of iterations to speed up the process;
# in practice you should use an even higher number than the default
fu &amp;lt;- follow_up(survey1, survey2, size_subset = 50, iterations = 10)
## Warning: There are some problems with the data.
## Warning: There are some problems with the new samples and the batches.
## Warning: There are some problems with the new data.
## Warning: There are some problems with the old data.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Like the previous new function, it reports whether there are problems with the observations.
One can check each collection with &lt;code&gt;check_data&lt;/code&gt; to know more about the problems found.&lt;/p&gt;
&lt;p&gt;If you have already performed the experiment on your observations you can also check the distribution:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Create the first batch
variables &amp;lt;- c(&amp;quot;Sex&amp;quot;, &amp;quot;Smoke&amp;quot;, &amp;quot;Age&amp;quot;)
survey1 &amp;lt;- survey1[, variables]
index1 &amp;lt;- design(survey1, size_subset = 50, iterations = 10)
## Warning: There might be some problems with the data use check_data().
r_survey &amp;lt;- inspect(index1, survey1)
# Create the second batch with &amp;quot;new&amp;quot; students
survey2 &amp;lt;- survey2[, variables]
survey2$batch &amp;lt;- NA
# Prepare the follow up
all_classroom &amp;lt;- rbind(r_survey, survey2)
fu2 &amp;lt;- follow_up2(all_classroom, size_subset = 50, iterations = 10)
## Warning: There are some problems with the data.
## Warning: There are some problems with the new samples and the batches.
## Warning: There are some problems with the new data.
## Warning: There are some problems with the old data.
tail(fu2)
## [1] &amp;quot;NewSubset2&amp;quot; &amp;quot;NewSubset2&amp;quot; &amp;quot;NewSubset2&amp;quot; &amp;quot;NewSubset2&amp;quot; &amp;quot;NewSubset2&amp;quot;
## [6] &amp;quot;NewSubset3&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using this function will help to decide which new observations go to which new batches.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;closing-remarks&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Closing remarks&lt;/h1&gt;
&lt;p&gt;The famous quote from Fisher goes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“To consult the statistician after an experiment is finished is often merely to ask him to conduct a &lt;em&gt;post mortem&lt;/em&gt; examination. He can perhaps say what the experiment died of.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This emphasizes the importance of involving a statistician early on in the experimental design process.&lt;br /&gt;
Unfortunately, in some cases, it may be too late to involve a statistician in the experimental design process, or unforeseen circumstances may have messed up the design of your carefully planned experiment.&lt;/p&gt;
&lt;p&gt;My aim with this package is to provide practical tools for statisticians, bioinformaticians, and anyone who works with data.
These tools are designed to be easy to use and can be used to analyze data in a variety of contexts.
Let me know if it is helpful in your case.&lt;/p&gt;
&lt;div id=&#34;reproducibility&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Reproducibility&lt;/h3&gt;
&lt;details&gt;
&lt;pre&gt;&lt;code&gt;## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.2.2 (2022-10-31)
##  os       Ubuntu 22.04.2 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language en_US
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Europe/Madrid
##  date     2023-04-09
##  pandoc   2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package     * version  date (UTC) lib source
##  blogdown      1.16     2022-12-13 [1] CRAN (R 4.2.2)
##  bookdown      0.33     2023-03-06 [1] CRAN (R 4.2.2)
##  bslib         0.4.2    2022-12-16 [1] CRAN (R 4.2.2)
##  cachem        1.0.7    2023-02-24 [1] CRAN (R 4.2.2)
##  cli           3.6.1    2023-03-23 [1] CRAN (R 4.2.2)
##  digest        0.6.31   2022-12-11 [1] CRAN (R 4.2.2)
##  evaluate      0.20     2023-01-17 [1] CRAN (R 4.2.2)
##  experDesign * 0.2.0    2023-04-05 [1] CRAN (R 4.2.2)
##  fastmap       1.1.1    2023-02-24 [1] CRAN (R 4.2.2)
##  htmltools     0.5.4    2022-12-07 [1] CRAN (R 4.2.2)
##  jquerylib     0.1.4    2021-04-26 [1] CRAN (R 4.2.2)
##  jsonlite      1.8.4    2022-12-06 [1] CRAN (R 4.2.2)
##  knitr         1.42     2023-01-25 [1] CRAN (R 4.2.2)
##  MASS        * 7.3-58.1 2022-08-03 [2] CRAN (R 4.2.2)
##  R6            2.5.1    2021-08-19 [1] CRAN (R 4.2.2)
##  rlang         1.1.0    2023-03-14 [1] CRAN (R 4.2.2)
##  rmarkdown     2.20     2023-01-19 [1] CRAN (R 4.2.2)
##  rstudioapi    0.14     2022-08-22 [1] CRAN (R 4.2.2)
##  sass          0.4.5    2023-01-24 [1] CRAN (R 4.2.2)
##  sessioninfo   1.2.2    2021-12-06 [1] CRAN (R 4.2.2)
##  xfun          0.37     2023-01-31 [1] CRAN (R 4.2.2)
##  yaml          2.3.7    2023-01-23 [1] CRAN (R 4.2.2)
## 
##  [1] /home/lluis/bin/R/4.2.2
##  [2] /opt/R/4.2.2/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Accessing REDCap from R</title>
      <link>https://llrs.dev/post/2023/02/08/accessing-redcap-from-r/</link>
      <pubDate>Wed, 08 Feb 2023 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2023/02/08/accessing-redcap-from-r/</guid>
      <description>


&lt;p&gt;In this post, I want to summarize some of the packages to connect to &lt;a href=&#34;https://www.project-redcap.org/&#34;&gt;REDCap&lt;/a&gt;.
For those who don’t know, REDCap is a database designed for clinical usage, which allows easy data collection of patients’ responses by clinicians and interactions with the patients via surveys.&lt;/p&gt;
&lt;p&gt;It has specific features such as scheduling surveys sent to patients, compatibility with tablets and mobile phones for data entry while visiting patients, grouping data in instruments (for repeating the same questions multiple times), multiple choice and check buttons, and different arms (like paths for patients).
Most importantly, it is relatively easy for clinical administrators to manage.&lt;/p&gt;
&lt;p&gt;In CRAN there are ~11 &lt;a href=&#34;https://search.r-project.org/?P=REDCap&amp;amp;SORT=&amp;amp;HITSPERPAGE=10&amp;amp;DB=cran-info&amp;amp;DEFAULTOP=and&amp;amp;FMT=query&amp;amp;xDB=all&amp;amp;xFILTERS=.%7E%7E&#34;&gt;packages mentioning it&lt;/a&gt; at the time of writing.
The purpose of this post is to help decide which packages can be helpful in which situations.
This post won’t be a deep analysis or comparison of capabilities; it describes some of the best and worst features of each package.&lt;/p&gt;
&lt;div id=&#34;redcapr&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;REDCapR&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=REDCapR&#34;&gt;REDCapR&lt;/a&gt; is the official package to connect to the database.
It allows you to read, write and filter the requests.
It has some security-related functions.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;redcaptidier&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;REDCapTidieR&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=REDCapTidieR&#34;&gt;REDCapTidieR&lt;/a&gt; is a package that provides summaries of tables and helps with nested tibbles data by arm.
It depends on REDCapR.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;tidyredcap&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;tidyREDCap&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=tidyREDCap&#34;&gt;tidyREDCap&lt;/a&gt; is a package that simplifies the tables for instruments and choose-all or choose-one question types.
It is easy to make tables and it depends on REDCapR.
It requires the first and last columns to make instruments.&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;images/redcap_design.jpg&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;Screenshot of a design with several instruments in a single arm (from &lt;a href=&#34;https://www.project-redcap.org/&#34; class=&#34;uri&#34;&gt;https://www.project-redcap.org/&lt;/a&gt;)&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;redcapexporter&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;REDCapExporter&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=REDCapExporter&#34;&gt;REDCapExporter&lt;/a&gt; is a package to build a data package from a database for redistribution.
It does not depend on REDCapR.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;redcapapi&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;redcapAPI&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=redcapAPI&#34;&gt;redcapAPI&lt;/a&gt; is a package for making data accessible and analysis-ready as quickly as possible.
It has extensive documentation in a wiki but no vignette or examples, and it does not depend on REDCapR.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;redcapdm&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;REDCapDM&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=REDCapDM&#34;&gt;REDCapDM&lt;/a&gt; is a package that provides functions to read and manage REDCap data and identify missing or extreme values as well as transform the data provided by the API.
It depends on REDCapR.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;reviewr&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;ReviewR&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=ReviewR&#34;&gt;ReviewR&lt;/a&gt; is a package that creates a shiny website with data from the database to explore it.
It uses REDCapR to connect to your instance.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;rccola&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;rccola&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=rccola&#34;&gt;rccola&lt;/a&gt; is a package to provide a secure connection to the database but it doesn’t provide any handling of the data.
It uses redcapAPI to connect to the database.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://llrs.dev/post/2023/02/08/accessing-redcap-from-r/index.en_files/figure-html/unnamed-chunk-1-1.png&#34; alt=&#34;Barplot with the dependencies: from less to more: REDCapExporter, rccola, redcapAPI, REDCapR, tidyREDCap, REDCapDM, REDCapTidieR, ReviewR&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;other-packages&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Other packages&lt;/h2&gt;
&lt;p&gt;Other packages mention REDCap:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://cran.r-project.org/package=nmadb&#34;&gt;nmadb&lt;/a&gt;: which implements its own connection procedure for a specific REDCap database of network meta-analyses.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cran.r-project.org/package=distcomp&#34;&gt;distcomp&lt;/a&gt;: which allows computation on distributed data, including data in REDCap.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cran.r-project.org/package=cgmanalysis&#34;&gt;cgmanalysis&lt;/a&gt;: which mentions that data produced is compatible with REDCap.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusion&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;I’m sure that many packages briefly described here can do much more than what I understood from a glance at their documentation and DESCRIPTION.&lt;/p&gt;
&lt;p&gt;Most packages provide some data for the examples (and probably tests), while others do not.
This is a technical problem that might impact users if there are no examples in the functions.&lt;/p&gt;
&lt;p&gt;REDCapR is used by most packages to access the database, but most of the packages focus on transforming the data provided by the API or the exported data.
It highlights that the data exported is useful but that depending on the preferences of the users it needs to be transformed for easy usage.&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Exploring CRAN&#39;s files: part 2</title>
      <link>https://llrs.dev/post/2022/07/28/cran-files-2/</link>
      <pubDate>Thu, 28 Jul 2022 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2022/07/28/cran-files-2/</guid>
      <description>


&lt;div id=&#34;introduction&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In the &lt;a href=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/&#34;&gt;first post&lt;/a&gt; of the series we briefly explored packages available on CRAN.
Now I’ll focus on the history of the packages and their size using the following files:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;packages &amp;lt;- tools::CRAN_package_db()
current &amp;lt;- tools:::CRAN_current_db()
archive &amp;lt;- tools:::CRAN_archive_db()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this part we will use two files: The &lt;code&gt;current&lt;/code&gt; and the &lt;code&gt;archive&lt;/code&gt;, let’s see why.&lt;/p&gt;
&lt;div id=&#34;current-file&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;current file&lt;/h3&gt;
&lt;p&gt;The current database has the package size, the date of modification (which I assume is the date it was added to CRAN) and the user name of who last modified it.
This is the same information returned by &lt;a href=&#34;https://search.r-project.org/R/refmans/base/html/file.info.html&#34;&gt;&lt;code&gt;file.info&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;current[1, 1:10]
##     size isdir mode               mtime               ctime               atime
## A3 42810 FALSE  664 2015-08-16 23:05:54 2022-09-03 12:02:27 2022-09-03 14:00:19
##     uid  gid  uname    grname
## A3 1001 1001 hornik cranadmin&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;archive-file&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;archive file&lt;/h3&gt;
&lt;p&gt;The archive database returns the same information but, as you might guess from the name, it doesn’t provide information about current packages but about packages in the archive that are no longer available by default.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;archive[[1]]
##                     size isdir mode               mtime               ctime
## A3/A3_0.9.1.tar.gz 45252 FALSE  664 2013-02-07 10:00:29 2022-08-22 18:14:53
## A3/A3_0.9.2.tar.gz 45907 FALSE  664 2013-03-26 19:58:40 2022-08-22 18:14:53
##                                  atime  uid  gid  uname    grname
## A3/A3_0.9.1.tar.gz 2022-08-22 17:39:50 1001 1001 hornik cranadmin
## A3/A3_0.9.2.tar.gz 2022-08-22 17:39:50 1010 1001 ligges cranadmin&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The date matches that available on the &lt;a href=&#34;https://cran.r-project.org/src/contrib/Archive/A3/&#34;&gt;web’s old sources&lt;/a&gt;, so we can be confident of its meaning.&lt;/p&gt;
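&lt;p&gt;Judging from the output above, the archive database is a list with one data frame per package and one row per archived tarball. Under that assumption, a minimal sketch to count the archived releases of each package could look like this (the &lt;code&gt;toy_archive&lt;/code&gt; object and the &lt;code&gt;pkgB&lt;/code&gt; entry are made up to keep the example self-contained):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy stand-in with the same shape as the archive database (pkgB is hypothetical)
toy_archive &amp;lt;- list(
  A3 = data.frame(size = c(45252, 45907),
                  row.names = c(&amp;quot;A3/A3_0.9.1.tar.gz&amp;quot;, &amp;quot;A3/A3_0.9.2.tar.gz&amp;quot;)),
  pkgB = data.frame(size = 1000,
                    row.names = &amp;quot;pkgB/pkgB_0.1.0.tar.gz&amp;quot;)
)
# One row per archived tarball, so the number of rows is the number of archived releases
releases &amp;lt;- vapply(toy_archive, nrow, integer(1))
# Packages with the most archived releases first
releases &amp;lt;- sort(releases, decreasing = TRUE)
releases&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Replacing &lt;code&gt;toy_archive&lt;/code&gt; with the real &lt;code&gt;archive&lt;/code&gt; object should give the same kind of summary for all of CRAN.&lt;/p&gt;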
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;cran-history&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;CRAN history&lt;/h2&gt;
&lt;p&gt;As we have seen there are some files about the archives of CRAN.
These include information about the date of modification (moving/editing), the user who did it and, of course, the name and sometimes the version of the package.
These archives are the great treasure of CRAN because they help to reproduce experiments or analyses that were run a long time ago.&lt;/p&gt;
&lt;p&gt;Note that I’m not totally sure that this archive contains the full record of packages, some initial packages might be missing.
I’m also aware of some packages removed by CRAN which no longer appear in these records.&lt;/p&gt;
&lt;p&gt;Nevertheless, this should provide an accurate picture of packages available through time.
Also, as there is no information here about when a package is archived (&lt;a href=&#34;https://llrs.dev/post/2021/12/07/reasons-cran-archivals/&#34;&gt;there is on PACKAGES.in&lt;/a&gt;), I might overestimate the packages available at any given moment.&lt;/p&gt;
&lt;p&gt;Remember the plot about &lt;a href=&#34;#accepted&#34;&gt;acceptance of packages on CRAN?&lt;/a&gt;
That plot only looked at the currently available packages; let’s check it with the whole archive:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:accumulative-packages&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/accumulative-packages-1.png&#34; alt=&#34;*Packages on CRAN archive by their addition to it.* There are over 125000 archives on CRAN.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 1: &lt;em&gt;Packages on CRAN archive by their addition to it.&lt;/em&gt; There are over 125000 archives on CRAN.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;These archives come from packages with few releases as well as from packages with many releases.
If we look at which packages had the most releases:&lt;/p&gt;
&lt;template id=&#34;41fb6fac-ce02-4889-ac51-217e365f4058&#34;&gt;&lt;style&gt;
.tabwid table{
  border-spacing:0px !important;
  border-collapse:collapse;
  line-height:1;
  margin-left:auto;
  margin-right:auto;
  border-width: 0;
  display: table;
  margin-top: 1.275em;
  margin-bottom: 1.275em;
  border-color: transparent;
}
.tabwid_left table{
  margin-left:0;
}
.tabwid_right table{
  margin-right:0;
}
.tabwid td {
    padding: 0;
}
.tabwid a {
  text-decoration: none;
}
.tabwid thead {
    background-color: transparent;
}
.tabwid tfoot {
    background-color: transparent;
}
.tabwid table tr {
background-color: transparent;
}
.katex-display {
    margin: 0 0 !important;
}
&lt;/style&gt;&lt;div class=&#34;tabwid&#34;&gt;&lt;style&gt;.cl-e305f260{}.cl-e2fc13c6{font-family:&#39;DejaVu Sans&#39;;font-size:11pt;font-weight:normal;font-style:normal;text-decoration:none;color:rgba(0, 0, 0, 1.00);background-color:transparent;}.cl-e2fc2fdc{margin:0;text-align:left;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-e2fc2fe6{margin:0;text-align:right;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-e2fc7a46{width:69.7pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a5a{width:100.6pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a64{width:100.6pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a6e{width:69.7pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 
1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a6f{width:100.6pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a82{width:69.7pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a8c{width:100.6pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a96{width:69.7pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7a97{width:100.6pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7aa0{width:69.7pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7aa1{width:100.6pt;background-color:transparent;vertical-align: 
middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7aaa{width:69.7pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7aab{width:100.6pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-e2fc7ab4{width:69.7pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}&lt;/style&gt;&lt;table class=&#39;cl-e305f260&#39;&gt;
&lt;thead&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7aab&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;package&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7ab4&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;Releases&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a5a&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;spatstat&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a46&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;206&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a97&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;Matrix&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7aa0&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;204&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a6f&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;mgcv&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a82&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;162&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a64&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;RcppArmadillo&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a6e&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;150&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a64&#34;&gt;&lt;p 
class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;rgdal&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a6e&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;146&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a97&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;nlme&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7aa0&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;143&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a8c&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;caret&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a96&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;139&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a64&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;spdep&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a6e&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;139&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a97&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;lattice&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7aa0&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;137&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a64&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;plotrix&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td 
class=&#34;cl-e2fc7a6e&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;131&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a6f&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;sp&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a82&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;128&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a8c&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;XML&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a96&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;126&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a97&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;Rcmdr&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7aa0&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;123&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a97&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;lme4&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7aa0&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;122&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a5a&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;gstat&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a46&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span 
class=&#34;cl-e2fc13c6&#34;&gt;121&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a8c&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;arm&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a96&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;119&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a64&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;foreign&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a6e&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;117&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a5a&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;party&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a46&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;117&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7a64&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;maptools&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7a6e&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;113&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-e2fc7aa1&#34;&gt;&lt;p class=&#34;cl-e2fc2fdc&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;raster&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-e2fc7aaa&#34;&gt;&lt;p class=&#34;cl-e2fc2fe6&#34;&gt;&lt;span class=&#34;cl-e2fc13c6&#34;&gt;108&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;&lt;/template&gt;
&lt;div class=&#34;flextable-shadow-host&#34; id=&#34;c207439a-5643-4e95-950e-721182ef54dd&#34;&gt;&lt;/div&gt;
&lt;script&gt;
var dest = document.getElementById(&#34;c207439a-5643-4e95-950e-721182ef54dd&#34;);
var template = document.getElementById(&#34;41fb6fac-ce02-4889-ac51-217e365f4058&#34;);
var caption = template.content.querySelector(&#34;caption&#34;);
if(caption) {
  caption.style.cssText = &#34;display:block;text-align:center;&#34;;
  var newcapt = document.createElement(&#34;p&#34;);
  newcapt.appendChild(caption)
  dest.parentNode.insertBefore(newcapt, dest.previousSibling);
}
var fantome = dest.attachShadow({mode: &#39;open&#39;});
var templateContent = template.content;
fantome.appendChild(templateContent);
&lt;/script&gt;

&lt;p&gt;Surprisingly there are packages with more than 200 versions on CRAN!&lt;/p&gt;
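&lt;p&gt;A table like the one above can be obtained by counting the tarballs of each package. This is a minimal sketch with invented package names and counts (&lt;code&gt;tarballs&lt;/code&gt; is a toy vector, not CRAN data):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy vector: one entry per archived tarball, holding the package name.
tarballs &amp;lt;- c(&#34;pkgA&#34;, &#34;pkgA&#34;, &#34;pkgA&#34;, &#34;pkgB&#34;, &#34;pkgC&#34;, &#34;pkgC&#34;)

# Count the tarballs per package and sort from most to fewest releases.
releases &amp;lt;- sort(table(tarballs), decreasing = TRUE)
head(as.data.frame(releases, responseName = &#34;Releases&#34;), 2)&lt;/code&gt;&lt;/pre&gt;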
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:release-distribution&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/release-distribution-1.png&#34; alt=&#34;*Releases distribution*. Packages and number of releases&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 2: &lt;em&gt;Releases distribution&lt;/em&gt;. Packages and number of releases
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The most common number of releases is 1, the median is 3, and the mean is around 6.&lt;/p&gt;
&lt;p&gt;Given all these different versions of packages, how big are all the packages on CRAN?&lt;/p&gt;
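&lt;p&gt;In a right-skewed distribution like this one the most frequent value, the median and the mean can be far apart. A small sketch with invented release counts (not the CRAN data) shows how a few heavily updated packages pull the mean up:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Invented release counts for seven packages.
n_releases &amp;lt;- c(1, 1, 1, 3, 4, 5, 27)

# Most frequent value (mode), median and mean of the counts.
mode_releases &amp;lt;- as.numeric(names(which.max(table(n_releases))))
c(mode = mode_releases, median = median(n_releases), mean = mean(n_releases))
##   mode median   mean
##      1      3      6&lt;/code&gt;&lt;/pre&gt;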
&lt;div id=&#34;cran-size&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;CRAN size&lt;/h3&gt;
&lt;p&gt;Have you ever wondered how big CRAN is? Adding up the file sizes of the source packages, all CRAN source packages take approximately 96.8 Gb.&lt;/p&gt;
&lt;p&gt;This doesn’t include binaries for multiple architectures and OS.
The package size might indicate whether the package has a considerable amount of data.&lt;/p&gt;
&lt;p&gt;Looking at the size of the packages over time we can see this pattern:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:packages-size&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/packages-size-1.png&#34; alt=&#34;*Packages and their median size.* Archived packages have become bigger since 2014. Packages on CRAN have been getting bigger since 2017.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 3: &lt;em&gt;Packages and their median size.&lt;/em&gt; Archived packages have become bigger since 2014. Packages on CRAN have been getting bigger since 2017.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Packages currently available on CRAN are usually smaller than those no longer on CRAN.
Archived versions of packages still on CRAN are also usually bigger than the current versions.
The median size of packages is increasing (quickly).&lt;/p&gt;
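&lt;p&gt;Both the total size and the median size per year can be computed from a table with one row per tarball. This is a sketch on invented values: the data frame &lt;code&gt;toy&lt;/code&gt; and its &lt;code&gt;year&lt;/code&gt;, &lt;code&gt;size&lt;/code&gt; and &lt;code&gt;status&lt;/code&gt; columns are assumptions for illustration, not the data behind the plot.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Invented sizes (in bytes) of six tarballs.
toy &amp;lt;- data.frame(
  year = c(2013, 2013, 2013, 2014, 2014, 2014),
  size = c(100, 300, 200, 400, 800, 600),
  status = c(&#34;archived&#34;, &#34;current&#34;, &#34;archived&#34;, &#34;current&#34;, &#34;archived&#34;, &#34;current&#34;)
)

# Total size of all tarballs.
total_size &amp;lt;- sum(toy$size)

# Median size for each combination of year and status.
median_size &amp;lt;- aggregate(size ~ year + status, data = toy, FUN = median)
median_size&lt;/code&gt;&lt;/pre&gt;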
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:release-size&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/release-size-1.png&#34; alt=&#34;*Size of package with releases.* Packages are usually small but seem to gain weight when updating.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 4: &lt;em&gt;Size of package with releases.&lt;/em&gt; Packages are usually small but seem to gain weight when updating.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Typically packages increase their size with each new release until they reach about 50 releases.
Beyond that number of releases the plot depends on very few packages and might not be representative.&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:release-size2&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/release-size2-1.png&#34; alt=&#34;*Size of package with releases by availability.* Packages no longer in CRAN are usually smaller than those in it. The continuous black line is CRAN&#39;s current threshold, while the discontinuous black line is current median size.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 5: &lt;em&gt;Size of package with releases by availability.&lt;/em&gt; Packages no longer in CRAN are usually smaller than those in it. The continuous black line is CRAN’s current threshold, while the discontinuous black line is current median size.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Here we can appreciate better how packages tend to be below the CRAN threshold.
There isn’t much of a difference between packages available on CRAN and those archived.&lt;/p&gt;
&lt;p&gt;If we look at the size of the first release of each package over time we’ll see a representative view:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:size-time&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/28/cran-files-2/index.en_files/figure-html/size-time-1.png&#34; alt=&#34;*Size of the first release by time*. Package size increases with time with a peak around 2010 and increasing again since 2014 but still hasn&#39;t surpassed the previous record.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 6: &lt;em&gt;Size of the first release by time&lt;/em&gt;. Package size increases with time with a peak around 2010 and increasing again since 2014 but still hasn’t surpassed the previous record.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Package size tends to increase except for the brief period 2010-2014.
Currently it increases less than before that period but is close to its maximum.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusions&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Most packages are not updated much, between 1 and 3 times.
But some packages are updated quite a lot; this might mean they are data packages rather than software packages, or that they have frequent minor and major updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Most current packages have smaller size than those archived.
Packages no longer available usually had bigger size than those packages still on CRAN.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Surprisingly packages increase their size a lot until the 25th release.
They also grow with time, except for the period between 2010 and 2014.
This decrease might be due to a change in CRAN policy.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;future-parts&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Future parts&lt;/h2&gt;
&lt;p&gt;In future posts I’ll explore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;patterns in the acceptance of packages and of their updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the relation between dependencies, initial release and updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;who handled the packages.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Exploring CRAN&#39;s files: part 1</title>
      <link>https://llrs.dev/post/2022/07/23/cran-files-1/</link>
      <pubDate>Sat, 23 Jul 2022 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2022/07/23/cran-files-1/</guid>
      <description>


&lt;div id=&#34;introduction&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;There are many great things in base R, one of them is the &lt;a href=&#34;https://search.r-project.org/R/refmans/tools/html/00Index.html&#34;&gt;tools package&lt;/a&gt;.
This package has the functions that are used to build, check and create packages, documentation and manuals.&lt;/p&gt;
&lt;p&gt;As I wanted to know how CRAN works and its changes I was looking into the source code of tools.
I found some internal functions that access freely available files with information about CRAN packages.
These private functions are in the &lt;a href=&#34;https://svn.r-project.org/R/trunk/src/library/tools/R/CRANtools.R&#34;&gt;CRANtools.R file&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;packages &amp;lt;- tools::CRAN_package_db()
# current &amp;lt;- tools:::CRAN_current_db()
# archive &amp;lt;- tools:::CRAN_archive_db()
# issues &amp;lt;- tools::CRAN_check_issues()
# alias &amp;lt;- tools:::CRAN_aliases_db()
# rdxrefs &amp;lt;- tools:::CRAN_rdxrefs_db()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As I was not sure what information these files hold I asked on &lt;a href=&#34;https://stat.ethz.ch/pipermail/r-devel/2022-May/081770.html&#34;&gt;R-devel&lt;/a&gt; but I did not receive an answer.
They seem to be quite obscure and, as private functions, they might be removed without notice and shouldn’t be used in any dependency.
However, as the files contain information about CRAN they might provide interesting clues about the history of CRAN and how it is operated.&lt;/p&gt;
&lt;p&gt;In this post I will focus on the first file.
I’ll explore a couple of fields and in future posts I will use the other files to explore more about CRAN history.&lt;/p&gt;
&lt;div id=&#34;packages-file&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;packages file&lt;/h3&gt;
&lt;p&gt;First of all a very brief exploration of what is in this file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;##    Package Version Priority                        Depends
## 1       A3   1.0.0     &amp;lt;NA&amp;gt; R (&amp;gt;= 2.15.0), xtable, pbapply
## 2 AATtools   0.0.1     &amp;lt;NA&amp;gt;                   R (&amp;gt;= 3.6.0)
## 3   ABACUS   1.0.0     &amp;lt;NA&amp;gt;                   R (&amp;gt;= 3.1.0)
##                                 Imports LinkingTo
## 1                                  &amp;lt;NA&amp;gt;      &amp;lt;NA&amp;gt;
## 2  magrittr, dplyr, doParallel, foreach      &amp;lt;NA&amp;gt;
## 3 ggplot2 (&amp;gt;= 3.1.0), shiny (&amp;gt;= 1.3.1),      &amp;lt;NA&amp;gt;
##                               Suggests Enhances    License License_is_FOSS
## 1                  randomForest, e1071     &amp;lt;NA&amp;gt; GPL (&amp;gt;= 2)            &amp;lt;NA&amp;gt;
## 2                                 &amp;lt;NA&amp;gt;     &amp;lt;NA&amp;gt;      GPL-3            &amp;lt;NA&amp;gt;
## 3 rmarkdown (&amp;gt;= 1.13), knitr (&amp;gt;= 1.22)     &amp;lt;NA&amp;gt;      GPL-3            &amp;lt;NA&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Packages has similar information to &lt;code&gt;available.packages()&lt;/code&gt; but with many more columns: published date, reverse dependencies, X-CRAN-Comment, who packaged it…
Also note that these packages are not filtered to match R version, OS_type or subarch, and there are near duplicates (I learned about this filtering while reading the great documentation of &lt;a href=&#34;https://search.r-project.org/R/refmans/utils/html/available.packages.html&#34;&gt;&lt;code&gt;available.packages()&lt;/code&gt;&lt;/a&gt; and also finding some mentions online).&lt;/p&gt;
&lt;p&gt;As we have data from several years I’ll sometimes show the release dates of different R versions to provide some context.
Without further delay let’s explore the data!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;accepted&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Published packages&lt;/h2&gt;
&lt;p&gt;CRAN started some time ago (in 1997) but it hasn’t remained frozen.
The package archive (the A in CRAN) has been updating since then.
For instance the current packages do not include packages that were removed, archived or those replaced by updates.&lt;/p&gt;
&lt;p&gt;Packages are first submitted to CRAN and, once accepted, they are published.
As acceptance and publication are usually almost simultaneous I might use them as synonyms.
Looking at the current available packages and their publication date, we can see the following:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:daily-cran&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/daily-cran-1.png&#34; alt=&#34;ggplot2 plot of date vs packages accepted on a given day. Until 2020 less than 10 packages were accepted daily. Lately more than 30 are added to CRAN. The plot also displays the R release versions from 2.12 in 2010 to 4.2.0 in 2022.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 1: &lt;em&gt;Packages accepted on CRAN by the publication date.&lt;/em&gt;
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The oldest of the current packages was added in 2010.
This means a package without issues, dependency changes or bugs detected by the automatic checks for 12 years!&lt;/p&gt;
&lt;p&gt;The daily rate of acceptance has increased from fewer than 10 a day until 2020 to more than 30 this year, 2022.
If we summarize that information by month we see the same trend; the little bump in 2020 disappears, but other patterns appear:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:monthly-cran&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/monthly-cran-1.png&#34; alt=&#34;ggplot figure with the monthly published packages. Until 2015 it rises very slowly, then it is around 50 monthly packages and there are some wobbles. In 2022 it rose to over 800 packages.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 2: &lt;em&gt;Monthly packages published to CRAN&lt;/em&gt;. Some monthly variance is observed.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Instead of just one bump we see some waves, with fewer packages accepted on CRAN late in the year and an increase of packages in the first months of the year.&lt;/p&gt;
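&lt;p&gt;The monthly summary amounts to truncating each publication date to its month and counting. This is a minimal sketch with invented dates (the &lt;code&gt;published&lt;/code&gt; vector is a toy example, not the CRAN data):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Invented publication dates of six packages.
published &amp;lt;- as.Date(c(&#34;2021-11-03&#34;, &#34;2021-11-20&#34;, &#34;2022-01-05&#34;,
                       &#34;2022-01-17&#34;, &#34;2022-01-30&#34;, &#34;2022-02-11&#34;))

# Truncate each date to its month and count the packages per month.
month &amp;lt;- format(published, &#34;%Y-%m&#34;)
monthly &amp;lt;- table(month)
monthly&lt;/code&gt;&lt;/pre&gt;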
&lt;p&gt;If we look at the accumulated packages on CRAN we see an exponential growth:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-cumsum&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-cumsum-1.png&#34; alt=&#34;Plot with the accumulative number of packages in CRAN. Rising from a few tens to currently more than 18000.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 3: &lt;em&gt;Accumulation of packages&lt;/em&gt;. Most of the packages have been published in the last 2 years.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In fact, more of the packages currently on CRAN were added since March 2021 than in all the previous years.&lt;/p&gt;
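&lt;p&gt;The date that splits the current packages in two halves is the median publication date, and the cumulative share gives the full curve. This is a sketch with invented dates (the &lt;code&gt;published&lt;/code&gt; vector is a toy example, not the CRAN data):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Invented publication dates of five packages, sorted by time.
published &amp;lt;- sort(as.Date(c(&#34;2015-06-01&#34;, &#34;2019-03-10&#34;, &#34;2021-04-02&#34;,
                            &#34;2021-09-15&#34;, &#34;2022-05-20&#34;)))

# Cumulative share of packages published up to each date.
share &amp;lt;- seq_along(published) / length(published)

# Date at which half of the packages had been published.
half_date &amp;lt;- median(published)
data.frame(published, share)&lt;/code&gt;&lt;/pre&gt;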
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-perc&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-perc-1.png&#34; alt=&#34;Line with percentages of packages in CRAN by date. Close to 50% of current packages were published between 2010 and 2021.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 4: &lt;em&gt;Percentage of current packages on CRAN according to their date of publication&lt;/em&gt;. Most of them were published/updated on the last year and a half.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;This is a good time to remind you that the date being used is the date of publication of the current version of each package.
Many had previous versions on CRAN:&lt;/p&gt;
&lt;template id=&#34;9668142b-64d5-4c3d-842e-fbcef8304c16&#34;&gt;&lt;style&gt;
.tabwid table{
  border-spacing:0px !important;
  border-collapse:collapse;
  line-height:1;
  margin-left:auto;
  margin-right:auto;
  border-width: 0;
  display: table;
  margin-top: 1.275em;
  margin-bottom: 1.275em;
  border-color: transparent;
}
.tabwid_left table{
  margin-left:0;
}
.tabwid_right table{
  margin-right:0;
}
.tabwid td {
    padding: 0;
}
.tabwid a {
  text-decoration: none;
}
.tabwid thead {
    background-color: transparent;
}
.tabwid tfoot {
    background-color: transparent;
}
.tabwid table tr {
background-color: transparent;
}
&lt;/style&gt;&lt;div class=&#34;tabwid&#34;&gt;&lt;style&gt;.cl-3baefb4c{}.cl-3ba22c8c{font-family:&#39;DejaVu Sans&#39;;font-size:11pt;font-weight:normal;font-style:normal;text-decoration:none;color:rgba(0, 0, 0, 1.00);background-color:transparent;}.cl-3ba253e2{margin:0;text-align:left;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-3ba253ec{margin:0;text-align:right;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-3ba2b7e2{width:88.3pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b7f6{width:72.5pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b7f7{width:88.3pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b800{width:72.5pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid 
rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b80a{width:88.3pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-3ba2b814{width:72.5pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}&lt;/style&gt;&lt;table class=&#39;cl-3baefb4c&#39;&gt;
&lt;thead&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-3ba2b80a&#34;&gt;&lt;p class=&#34;cl-3ba253e2&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;First release&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-3ba2b814&#34;&gt;&lt;p class=&#34;cl-3ba253ec&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;Packages&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-3ba2b7e2&#34;&gt;&lt;p class=&#34;cl-3ba253e2&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;No&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-3ba2b7f6&#34;&gt;&lt;p class=&#34;cl-3ba253ec&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;14,294&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style=&#34;overflow-wrap:break-word;&#34;&gt;&lt;td class=&#34;cl-3ba2b7f7&#34;&gt;&lt;p class=&#34;cl-3ba253e2&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;Yes&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class=&#34;cl-3ba2b800&#34;&gt;&lt;p class=&#34;cl-3ba253ec&#34;&gt;&lt;span class=&#34;cl-3ba22c8c&#34;&gt;4,113&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;&lt;/template&gt;
&lt;div class=&#34;flextable-shadow-host&#34; id=&#34;1027b3f4-86a2-414b-90aa-a3bab733e0c0&#34;&gt;&lt;/div&gt;
&lt;script&gt;
var dest = document.getElementById(&#34;1027b3f4-86a2-414b-90aa-a3bab733e0c0&#34;);
var template = document.getElementById(&#34;9668142b-64d5-4c3d-842e-fbcef8304c16&#34;);
var caption = template.content.querySelector(&#34;caption&#34;);
if(caption) {
  caption.style.cssText = &#34;display:block;text-align:center;&#34;;
  var newcapt = document.createElement(&#34;p&#34;);
  newcapt.appendChild(caption)
  dest.parentNode.insertBefore(newcapt, dest.previousSibling);
}
var fantome = dest.attachShadow({mode: &#39;open&#39;});
var templateContent = template.content;
fantome.appendChild(templateContent);
&lt;/script&gt;

&lt;/div&gt;
&lt;div id=&#34;delays&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Processing time&lt;/h2&gt;
&lt;p&gt;Previously I found that &lt;a href=&#34;https://llrs.dev/post/2021/01/31/cran-review/&#34;&gt;CRAN submissions&lt;/a&gt; present some key differences between new packages and already published packages, which affect how long they need to wait to be published on CRAN.
With the existing data we can estimate how fast the process is by comparing the published date with the build date.&lt;/p&gt;
&lt;p&gt;The build date is added to the tar.gz file automatically when the developer builds the package via &lt;code&gt;R CMD build&lt;/code&gt;. However, the published date is set by CRAN once the packages are accepted on CRAN.&lt;/p&gt;
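&lt;p&gt;As a minimal sketch of that comparison, the snippet below computes the delay in days for a single package. It assumes the two dates come from the &lt;code&gt;Packaged&lt;/code&gt; and &lt;code&gt;Date/Publication&lt;/code&gt; fields of the DESCRIPTION file; the values are made up for illustration and this is not necessarily the code used for the figures.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Made-up values in the format R CMD build and CRAN write to DESCRIPTION
packaged &amp;lt;- &amp;quot;2022-07-20 10:12:33 UTC; lluis&amp;quot;
published &amp;lt;- &amp;quot;2022-07-23 09:30:02 UTC&amp;quot;

# Both fields start with a YYYY-MM-DD date: keep day precision only
build_date &amp;lt;- as.Date(substr(packaged, 1, 10))
publish_date &amp;lt;- as.Date(substr(published, 1, 10))

# Days between building the package and its publication
delay_days &amp;lt;- as.numeric(publish_date - build_date, units = &amp;quot;days&amp;quot;)
delay_days&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Working with whole days matches the precision of the published date, so a package built and published on the same day gets a delay of zero.&lt;/p&gt;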
&lt;p&gt;To visualize these delays I will also compare new packages with those that were already on CRAN:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-delays&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-delays-1.png&#34; alt=&#34;Histogram of packages and the time between build and publication. They take less than 50 days usually.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 5: &lt;em&gt;Histogram of time difference between building and publishing a package.&lt;/em&gt; Color indicates if the package is new to CRAN or not. Most of the published packages take more or less the same time regardless of if it is the first time or not.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;There doesn’t seem to be much difference in the time between building and publication depending on whether it is the first release or not.
The precision is just one day, and this is usually a fast process, well below 50 days.
Few packages spend longer than that between build and publication, and they are too few to be noticeable at this scale.
Since 2016/05/02 there is a &lt;a href=&#34;https://github.com/r-devel/r-svn/blob/676c1183801648b68f8f6719701445b2f9a5e3fd/src/library/tools/R/QC.R#L7583&#34;&gt;check&lt;/a&gt; that raises an issue if the build is older than a month.&lt;/p&gt;
&lt;p&gt;Note that one might need to build the package multiple times before it is accepted.
Packages published for the first time on CRAN might have been submitted previously, but once they are finally built and pass the checks and manual review they are handled as fast as packages already on CRAN.&lt;/p&gt;
&lt;p&gt;However, this time between build and acceptance might have changed with time:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-delays2&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-delays2-1.png&#34; alt=&#34;Smoothed lines of published packages with different linetype and color depending on if it is the first time they are on CRAN or not. New packages currently take less than 4 days and old packages less than 2. This is down from 2018 to 2021, when new packages took above 4 days to be published on CRAN&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 6: &lt;em&gt;Processing time between building the package and being published by date.&lt;/em&gt; There is a large difference between new packages and existing ones. New packages usually take more time, while existing packages currently take less than a day.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;We clearly see a difference in processing time for those packages already on CRAN and those that are not.
Keep in mind that for the few packages from before 2016 the estimation might not be accurate.
At the same time this is consistent with the manual review process (For more information see &lt;a href=&#34;https://llrs.dev/post/2021/01/31/cran-review/&#34;&gt;my previous post&lt;/a&gt; about the review process of CRAN or my &lt;a href=&#34;https://llrs.dev/talk/user-2021/&#34;&gt;talk at the useR2021&lt;/a&gt;).
It also means that there is a huge variation in how long packages take to be handled.
However, this seems to be shrinking: while in 2010 it took around 2 weeks, nowadays it takes less than a week and is getting closer to the median of 1 day between build and appearance on CRAN that existing packages take.&lt;/p&gt;
&lt;p&gt;This difference might be explained by experience: authors and maintainers whose package(s) are already on CRAN know better how to submit a new version without problems in the checks.&lt;/p&gt;
&lt;p&gt;It could also be that new packages need more time from the CRAN team.
In 2020 we see it took longer than in previous years for packages to be added on CRAN.
Maybe the increase in processing time in 2020 was due to the huge volume of submissions CRAN received, or to more checks on the developer side before submitting to CRAN.&lt;/p&gt;
&lt;p&gt;The two explanations are not mutually exclusive.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;
Do more packages published on the same day mean more processing time? It doesn’t look like it.
&lt;/summary&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:cran-reasons&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://llrs.dev/post/2022/07/23/cran-files-1/index.en_files/figure-html/cran-reasons-1.png&#34; alt=&#34;ggplot graphic with the time of processing time and the number of packages accepted the same day. New packages have less delay than already published packages, but the more packages are accepted, the less delay there is.&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 7: &lt;em&gt;Packages accepted the same day and processing time.&lt;/em&gt; New packages are accepted sooner than packages already on CRAN with respect to the build date.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Surprisingly, we see a lot of variation in the delay of packages already accepted on CRAN.
In addition, the more new packages are accepted on the same day, the shorter the delay.
I think this just means that when reviewers work on the submission queue several packages might be approved at once.&lt;/p&gt;
&lt;p&gt;This might also mean that packages had already been built several times before finally being accepted, once the errors, warnings and notes were solved.
Lastly, this could indicate that developers with a package already on CRAN wait a bit between building and submitting, perhaps taking some time to double check before submission (dependencies, checks on several machines, other?), or it could reflect a time zone difference (submitting at noon in one region, which is night for the reviewers).&lt;/p&gt;
&lt;/details&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusion&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;There are packages that have been working without problems for 12 years despite several major changes in R (See figure &lt;a href=&#34;#fig:daily-cran&#34;&gt;1&lt;/a&gt;).
This speaks volumes of the packages’ quality, and of the backward compatibility that R core aims for and CRAN checks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;CRAN accepts an incredible number of packages daily and monthly.
The system and the team are doing incredible work, mostly in their free time (See figure &lt;a href=&#34;#fig:monthly-cran&#34;&gt;2&lt;/a&gt;).
Many thanks!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Accepted packages are handled very fast, usually in less than a week (See figure &lt;a href=&#34;#fig:cran-reasons&#34;&gt;7&lt;/a&gt;).
However, it is not possible to separate the time spent in the submission system from the time spent on the developer’s computer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;future-parts&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Future parts&lt;/h2&gt;
&lt;p&gt;We’ve explored a snapshot of current packages and a brief window of all the history of CRAN.
There is much more that can be done with all the other files.&lt;/p&gt;
&lt;p&gt;In future posts I’ll explore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;patterns in accepting packages and in package updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;who handled the packages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the size of packages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the relation between dependencies, initial release and updates.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Other suggestions?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Edit&lt;/strong&gt;: Many thanks to &lt;a href=&#34;https://masalmon.eu/&#34;&gt;Maëlle Salmon&lt;/a&gt; and &lt;a href=&#34;https://dirk.eddelbuettel.com/&#34;&gt;Dirk Eddelbuettel&lt;/a&gt; for their feedback on an initial version of this series of posts.&lt;/p&gt;
&lt;div id=&#34;reproducibility&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Reproducibility&lt;/h3&gt;
&lt;details&gt;
&lt;pre&gt;&lt;code&gt;## - Session info -------------------------------------------------------------------------------------------------------
##  setting  value
##  version  R version 4.2.1 (2022-06-23)
##  os       Ubuntu 20.04.4 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  C
##  ctype    C
##  tz       Europe/Madrid
##  date     2022-07-23
##  pandoc   2.18 @ /usr/lib/rstudio/bin/quarto/bin/tools/ (via rmarkdown)
## 
## - Packages -----------------------------------------------------------------------------------------------------------
##  package      * version    date (UTC) lib source
##  assertthat     0.2.1      2019-03-21 [2] RSPM (R 4.2.0)
##  base64enc      0.1-3      2015-07-28 [2] CRAN (R 4.0.0)
##  blogdown       1.10       2022-05-10 [2] RSPM (R 4.2.0)
##  bookdown       0.27       2022-06-14 [2] RSPM (R 4.2.0)
##  bslib          0.4.0      2022-07-16 [2] RSPM (R 4.2.0)
##  cachem         1.0.6      2021-08-19 [2] RSPM (R 4.2.0)
##  cli            3.3.0      2022-04-25 [2] RSPM (R 4.2.0)
##  codetools      0.2-18     2020-11-04 [2] RSPM (R 4.2.0)
##  colorspace     2.0-3      2022-02-21 [2] RSPM (R 4.2.0)
##  crayon         1.5.1      2022-03-26 [2] RSPM (R 4.2.0)
##  curl           4.3.2      2021-06-23 [2] RSPM (R 4.2.0)
##  data.table     1.14.2     2021-09-27 [2] RSPM (R 4.2.0)
##  DBI            1.1.3      2022-06-18 [2] RSPM (R 4.2.0)
##  digest         0.6.29     2021-12-01 [2] RSPM (R 4.2.0)
##  dplyr        * 1.0.9      2022-04-28 [2] RSPM (R 4.2.0)
##  ellipsis       0.3.2      2021-04-29 [2] RSPM (R 4.2.0)
##  evaluate       0.15       2022-02-18 [2] RSPM (R 4.2.0)
##  fansi          1.0.3      2022-03-24 [2] RSPM (R 4.2.0)
##  farver         2.1.1      2022-07-06 [2] RSPM (R 4.2.0)
##  fastmap        1.1.0      2021-01-25 [2] RSPM (R 4.2.0)
##  flextable    * 0.7.2      2022-06-12 [2] RSPM (R 4.2.0)
##  forcats      * 0.5.1      2021-01-27 [2] RSPM (R 4.2.0)
##  gdtools        0.2.4      2022-02-14 [2] RSPM (R 4.2.0)
##  generics       0.1.3      2022-07-05 [2] RSPM (R 4.2.0)
##  geomtextpath * 0.1.0      2022-01-24 [2] CRAN (R 4.2.1)
##  ggplot2      * 3.3.6.9000 2022-06-29 [2] Github (tidyverse/ggplot2@7571122)
##  ggrepel      * 0.9.1      2021-01-15 [2] RSPM (R 4.2.0)
##  glue           1.6.2      2022-02-24 [2] RSPM (R 4.2.0)
##  gtable         0.3.0      2019-03-25 [2] CRAN (R 4.0.0)
##  highr          0.9        2021-04-16 [2] RSPM (R 4.2.0)
##  htmltools      0.5.3      2022-07-18 [2] RSPM (R 4.2.0)
##  jquerylib      0.1.4      2021-04-26 [2] RSPM (R 4.2.0)
##  jsonlite       1.8.0      2022-02-22 [2] RSPM (R 4.2.0)
##  knitr          1.39       2022-04-26 [2] RSPM (R 4.2.0)
##  labeling       0.4.2      2020-10-20 [2] RSPM (R 4.2.0)
##  lattice        0.20-45    2021-09-22 [3] CRAN (R 4.2.0)
##  lifecycle      1.0.1      2021-09-24 [2] RSPM (R 4.2.0)
##  lubridate    * 1.8.0      2021-10-07 [2] RSPM (R 4.2.0)
##  magrittr       2.0.3      2022-03-30 [2] RSPM (R 4.2.0)
##  Matrix         1.4-1      2022-03-23 [2] RSPM (R 4.2.0)
##  mgcv           1.8-40     2022-03-29 [2] RSPM (R 4.2.0)
##  munsell        0.5.0      2018-06-12 [2] RSPM (R 4.2.0)
##  nlme           3.1-158    2022-06-15 [2] RSPM (R 4.2.0)
##  officer        0.4.3      2022-06-12 [2] RSPM (R 4.2.0)
##  pillar         1.8.0      2022-07-18 [2] RSPM (R 4.2.0)
##  pkgconfig      2.0.3      2019-09-22 [2] RSPM (R 4.2.0)
##  purrr          0.3.4      2020-04-17 [2] RSPM (R 4.2.0)
##  R6             2.5.1      2021-08-19 [2] RSPM (R 4.2.0)
##  Rcpp           1.0.9      2022-07-08 [2] RSPM (R 4.2.0)
##  rlang          1.0.4      2022-07-12 [2] RSPM (R 4.2.0)
##  rmarkdown      2.14       2022-04-25 [2] RSPM (R 4.2.0)
##  rstudioapi     0.13       2020-11-12 [2] RSPM (R 4.2.0)
##  rversions    * 2.1.1      2021-05-31 [2] RSPM (R 4.2.0)
##  sass           0.4.2      2022-07-16 [2] RSPM (R 4.2.0)
##  scales         1.2.0      2022-04-13 [2] RSPM (R 4.2.0)
##  sessioninfo    1.2.2      2021-12-06 [2] RSPM (R 4.2.0)
##  stringi        1.7.8      2022-07-11 [2] RSPM (R 4.2.0)
##  stringr        1.4.0      2019-02-10 [2] RSPM (R 4.2.0)
##  systemfonts    1.0.4      2022-02-11 [2] RSPM (R 4.2.0)
##  textshaping    0.3.6      2021-10-13 [2] RSPM (R 4.2.0)
##  tibble         3.1.7      2022-05-03 [2] RSPM (R 4.2.0)
##  tidyr        * 1.2.0      2022-02-01 [2] RSPM (R 4.2.0)
##  tidyselect     1.1.2      2022-02-21 [2] RSPM (R 4.2.0)
##  utf8           1.2.2      2021-07-24 [2] RSPM (R 4.2.0)
##  uuid           1.1-0      2022-04-19 [2] RSPM (R 4.2.0)
##  vctrs          0.4.1      2022-04-13 [2] RSPM (R 4.2.0)
##  withr          2.5.0      2022-03-03 [2] RSPM (R 4.2.0)
##  xfun           0.31       2022-05-10 [2] RSPM (R 4.2.0)
##  xml2           1.3.3      2021-11-30 [2] RSPM (R 4.2.0)
##  yaml           2.3.5      2022-02-21 [2] RSPM (R 4.2.0)
##  zip            2.2.0      2021-05-31 [2] RSPM (R 4.2.0)
## 
##  [1] /home/lluis/bin/R/4.2.1
##  [2] /usr/lib/R/site-library
##  [3] /usr/lib/R/library
## 
## ----------------------------------------------------------------------------------------------------------------------&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Upgrading rtweet to 1.0.2</title>
      <link>https://llrs.dev/post/2022/07/04/rtweet-1-0-0/</link>
      <pubDate>Mon, 04 Jul 2022 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2022/07/04/rtweet-1-0-0/</guid>
      <description>


&lt;p&gt;In this post I will provide some examples of what has changed between rtweet 0.7.0 and rtweet 1.0.2.
I hope both the changes and this guide will help all users.
I highlight the most important and interesting changes in this blog post; for the full list of changes, consult the &lt;a href=&#34;https://docs.ropensci.org/rtweet/news/index.html&#34;&gt;NEWS&lt;/a&gt;.&lt;/p&gt;
&lt;div id=&#34;big-breaking-changes&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Big breaking changes&lt;/strong&gt;&lt;/h2&gt;
&lt;div id=&#34;more-consistent-output&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;More consistent output&lt;/h3&gt;
&lt;p&gt;This is probably what will affect the most users.
All functions that return data about tweets&lt;a href=&#34;#fn1&#34; class=&#34;footnote-ref&#34; id=&#34;fnref1&#34;&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt; now return the same columns.&lt;/p&gt;
&lt;p&gt;For example if we search some tweets we’ll get the following columns:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;&amp;gt; tweets &amp;lt;- search_tweets(&amp;quot;weather&amp;quot;)
&amp;gt; colnames(tweets)
 [1] &amp;quot;created_at&amp;quot;                    &amp;quot;id&amp;quot;                           
 [3] &amp;quot;id_str&amp;quot;                        &amp;quot;full_text&amp;quot;                    
 [5] &amp;quot;truncated&amp;quot;                     &amp;quot;display_text_range&amp;quot;           
 [7] &amp;quot;entities&amp;quot;                      &amp;quot;metadata&amp;quot;                     
 [9] &amp;quot;source&amp;quot;                        &amp;quot;in_reply_to_status_id&amp;quot;        
[11] &amp;quot;in_reply_to_status_id_str&amp;quot;     &amp;quot;in_reply_to_user_id&amp;quot;          
[13] &amp;quot;in_reply_to_user_id_str&amp;quot;       &amp;quot;in_reply_to_screen_name&amp;quot;      
[15] &amp;quot;geo&amp;quot;                           &amp;quot;coordinates&amp;quot;                  
[17] &amp;quot;place&amp;quot;                         &amp;quot;contributors&amp;quot;                 
[19] &amp;quot;is_quote_status&amp;quot;               &amp;quot;retweet_count&amp;quot;                
[21] &amp;quot;favorite_count&amp;quot;                &amp;quot;favorited&amp;quot;                    
[23] &amp;quot;retweeted&amp;quot;                     &amp;quot;lang&amp;quot;                         
[25] &amp;quot;quoted_status_id&amp;quot;              &amp;quot;quoted_status_id_str&amp;quot;         
[27] &amp;quot;quoted_status&amp;quot;                 &amp;quot;possibly_sensitive&amp;quot;           
[29] &amp;quot;retweeted_status&amp;quot;              &amp;quot;text&amp;quot;                         
[31] &amp;quot;favorited_by&amp;quot;                  &amp;quot;scopes&amp;quot;                       
[33] &amp;quot;display_text_width&amp;quot;            &amp;quot;quoted_status_permalink&amp;quot;      
[35] &amp;quot;quote_count&amp;quot;                   &amp;quot;timestamp_ms&amp;quot;                 
[37] &amp;quot;reply_count&amp;quot;                   &amp;quot;filter_level&amp;quot;                 
[39] &amp;quot;query&amp;quot;                         &amp;quot;withheld_scope&amp;quot;               
[41] &amp;quot;withheld_copyright&amp;quot;            &amp;quot;withheld_in_countries&amp;quot;        
[43] &amp;quot;possibly_sensitive_appealable&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;rtweet now minimizes the processing of tweets and only returns the same data as provided by the API while making it easier to handle by R.
However, to preserve the nested nature of the data returned some fields are now nested inside others.
For example, previously fields &lt;code&gt;&#34;bbox_coords&#34;&lt;/code&gt;, &lt;code&gt;&#34;geo_coords&#34;&lt;/code&gt;, &lt;code&gt;&#34;coords_coords&#34;&lt;/code&gt; were returned as separate columns, but they are now nested inside &lt;code&gt;&#34;place&#34;&lt;/code&gt;, &lt;code&gt;&#34;coordinates&#34;&lt;/code&gt; or &lt;code&gt;&#34;geo&#34;&lt;/code&gt; depending on where they are provided.
Some columns previously calculated by rtweet are no longer returned, like &lt;code&gt;&#34;rtweet_favorite_count&#34;&lt;/code&gt;.
At the same time it provides new columns about each tweet, like the &lt;code&gt;&#34;withheld_*&#34;&lt;/code&gt; columns.&lt;/p&gt;
&lt;p&gt;If you scanned through the columns you might have noticed that columns &lt;code&gt;&#34;user_id&#34;&lt;/code&gt; and &lt;code&gt;&#34;screen_name&#34;&lt;/code&gt; are no longer returned.
This data is still returned by the API but it is now made available to the rtweet users via &lt;code&gt;users_data()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;&amp;gt; colnames(users_data(tweets))
 [1] &amp;quot;id&amp;quot;                      &amp;quot;id_str&amp;quot;                 
 [3] &amp;quot;name&amp;quot;                    &amp;quot;screen_name&amp;quot;            
 [5] &amp;quot;location&amp;quot;                &amp;quot;description&amp;quot;            
 [7] &amp;quot;url&amp;quot;                     &amp;quot;protected&amp;quot;              
 [9] &amp;quot;followers_count&amp;quot;         &amp;quot;friends_count&amp;quot;          
[11] &amp;quot;listed_count&amp;quot;            &amp;quot;created_at&amp;quot;             
[13] &amp;quot;favourites_count&amp;quot;        &amp;quot;verified&amp;quot;               
[15] &amp;quot;statuses_count&amp;quot;          &amp;quot;profile_image_url_https&amp;quot;
[17] &amp;quot;profile_banner_url&amp;quot;      &amp;quot;default_profile&amp;quot;        
[19] &amp;quot;default_profile_image&amp;quot;   &amp;quot;withheld_in_countries&amp;quot;  
[21] &amp;quot;derived&amp;quot;                 &amp;quot;withheld_scope&amp;quot;         
[23] &amp;quot;entities&amp;quot; &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This blog post should help you find the right data columns, but if you don’t find what you are looking for it might be nested inside a column.&lt;br /&gt;
Try using &lt;code&gt;dplyr::glimpse()&lt;/code&gt; to explore the data and locate nested columns.
For example, the entities column (which is present in both tweets and users) has the following useful fields:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;&amp;gt; names(tweets$entities[[1]])
[1] &amp;quot;hashtags&amp;quot;      &amp;quot;symbols&amp;quot;       &amp;quot;user_mentions&amp;quot; &amp;quot;urls&amp;quot;         
[5] &amp;quot;media&amp;quot; &lt;/code&gt;&lt;/pre&gt;
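&lt;p&gt;Once a nested column is located, base R is enough to pull out the part of interest. The sketch below builds a small mock object (hypothetical data, not real API output) that assumes each element of &lt;code&gt;entities&lt;/code&gt; holds a &lt;code&gt;hashtags&lt;/code&gt; data frame with a &lt;code&gt;text&lt;/code&gt; column, and flattens it to one row per tweet and hashtag:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Mock data imitating the nested layout (not real API output)
tweets &amp;lt;- data.frame(id_str = c(&amp;quot;1&amp;quot;, &amp;quot;2&amp;quot;))
tweets$entities &amp;lt;- list(
  list(hashtags = data.frame(text = c(&amp;quot;rstats&amp;quot;, &amp;quot;weather&amp;quot;))),
  list(hashtags = data.frame(text = &amp;quot;rstats&amp;quot;))
)

# Pull the hashtag text out of the nested entities of each tweet
hashtags &amp;lt;- lapply(tweets$entities, function(x) x$hashtags$text)

# One row per tweet and hashtag, easier to count or join
flat &amp;lt;- data.frame(
  id_str = rep(tweets$id_str, lengths(hashtags)),
  hashtag = unlist(hashtags)
)
flat&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The same pattern should apply to the other nested fields, although their exact layout may differ from this mock.&lt;/p&gt;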
&lt;p&gt;Similarly if you look up a user via &lt;code&gt;search_users()&lt;/code&gt; or &lt;code&gt;lookup_users()&lt;/code&gt; you’ll get consistent data:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;&amp;gt; users &amp;lt;- lookup_users(c(&amp;quot;twitter&amp;quot;, &amp;quot;rladiesglobal&amp;quot;, &amp;quot;_R_Foundation&amp;quot;))
&amp;gt; colnames(users)
 [1] &amp;quot;id&amp;quot;                      &amp;quot;id_str&amp;quot;                 
 [3] &amp;quot;name&amp;quot;                    &amp;quot;screen_name&amp;quot;            
 [5] &amp;quot;location&amp;quot;                &amp;quot;description&amp;quot;            
 [7] &amp;quot;url&amp;quot;                     &amp;quot;protected&amp;quot;              
 [9] &amp;quot;followers_count&amp;quot;         &amp;quot;friends_count&amp;quot;          
[11] &amp;quot;listed_count&amp;quot;            &amp;quot;created_at&amp;quot;             
[13] &amp;quot;favourites_count&amp;quot;        &amp;quot;verified&amp;quot;               
[15] &amp;quot;statuses_count&amp;quot;          &amp;quot;profile_image_url_https&amp;quot;
[17] &amp;quot;profile_banner_url&amp;quot;      &amp;quot;default_profile&amp;quot;        
[19] &amp;quot;default_profile_image&amp;quot;   &amp;quot;withheld_in_countries&amp;quot;  
[21] &amp;quot;derived&amp;quot;                 &amp;quot;withheld_scope&amp;quot;         
[23] &amp;quot;entities&amp;quot;               &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can use &lt;code&gt;tweets_data()&lt;/code&gt; to retrieve information about their latest tweet:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;&amp;gt; colnames(tweets_data(users))
 [1] &amp;quot;created_at&amp;quot;                    &amp;quot;id&amp;quot;                           
 [3] &amp;quot;id_str&amp;quot;                        &amp;quot;text&amp;quot;                         
 [5] &amp;quot;truncated&amp;quot;                     &amp;quot;entities&amp;quot;                     
 [7] &amp;quot;source&amp;quot;                        &amp;quot;in_reply_to_status_id&amp;quot;        
 [9] &amp;quot;in_reply_to_status_id_str&amp;quot;     &amp;quot;in_reply_to_user_id&amp;quot;          
[11] &amp;quot;in_reply_to_user_id_str&amp;quot;       &amp;quot;in_reply_to_screen_name&amp;quot;      
[13] &amp;quot;geo&amp;quot;                           &amp;quot;coordinates&amp;quot;                  
[15] &amp;quot;place&amp;quot;                         &amp;quot;contributors&amp;quot;                 
[17] &amp;quot;is_quote_status&amp;quot;               &amp;quot;retweet_count&amp;quot;                
[19] &amp;quot;favorite_count&amp;quot;                &amp;quot;favorited&amp;quot;                    
[21] &amp;quot;retweeted&amp;quot;                     &amp;quot;lang&amp;quot;                         
[23] &amp;quot;retweeted_status&amp;quot;              &amp;quot;possibly_sensitive&amp;quot;           
[25] &amp;quot;quoted_status&amp;quot;                 &amp;quot;display_text_width&amp;quot;           
[27] &amp;quot;user&amp;quot;                          &amp;quot;full_text&amp;quot;                    
[29] &amp;quot;favorited_by&amp;quot;                  &amp;quot;scopes&amp;quot;                       
[31] &amp;quot;display_text_range&amp;quot;            &amp;quot;quoted_status_id&amp;quot;             
[33] &amp;quot;quoted_status_id_str&amp;quot;          &amp;quot;quoted_status_permalink&amp;quot;      
[35] &amp;quot;quote_count&amp;quot;                   &amp;quot;timestamp_ms&amp;quot;                 
[37] &amp;quot;reply_count&amp;quot;                   &amp;quot;filter_level&amp;quot;                 
[39] &amp;quot;metadata&amp;quot;                      &amp;quot;query&amp;quot;                        
[41] &amp;quot;withheld_scope&amp;quot;                &amp;quot;withheld_copyright&amp;quot;           
[43] &amp;quot;withheld_in_countries&amp;quot;         &amp;quot;possibly_sensitive_appealable&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can merge them via:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;users_and_last_tweets &amp;lt;- cbind(users, id_str = tweets_data(users)[, &amp;quot;id_str&amp;quot;])&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the future (&lt;a href=&#34;#future&#34;&gt;see below&lt;/a&gt;), helper functions will make managing the output of rtweet easier.&lt;/p&gt;
&lt;p&gt;Finally, &lt;code&gt;get_followers()&lt;/code&gt; and &lt;code&gt;get_friends()&lt;/code&gt; now return the same columns:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;&amp;gt; colnames(get_followers(&amp;quot;_R_Foundation&amp;quot;))
[1] &amp;quot;from_id&amp;quot; &amp;quot;to_id&amp;quot;  
&amp;gt; colnames(get_friends(&amp;quot;_R_Foundation&amp;quot;))
[1] &amp;quot;from_id&amp;quot; &amp;quot;to_id&amp;quot;  &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will make it easier to build networks of connections (although you might want to convert screen names to ids or vice versa).&lt;/p&gt;
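&lt;p&gt;As a rough sketch of why the shared columns help, the mock data frames below (hypothetical ids, not real API output) can be stacked into a single edge list. It assumes that &lt;code&gt;from_id&lt;/code&gt; is the account that follows and &lt;code&gt;to_id&lt;/code&gt; the account being followed:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Mock output with the two shared columns (hypothetical ids)
followers &amp;lt;- data.frame(from_id = c(&amp;quot;11&amp;quot;, &amp;quot;12&amp;quot;), to_id = &amp;quot;_R_Foundation&amp;quot;)
friends &amp;lt;- data.frame(from_id = &amp;quot;_R_Foundation&amp;quot;, to_id = c(&amp;quot;12&amp;quot;, &amp;quot;13&amp;quot;))

# Same column names, so both directions stack into one edge list
edges &amp;lt;- rbind(followers, friends)

# Accounts connected in both directions
mutual &amp;lt;- intersect(followers$from_id, friends$to_id)
mutual&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that the mock mixes a screen name with numeric ids, which is the situation where converting one into the other becomes necessary.&lt;/p&gt;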
&lt;/div&gt;
&lt;div id=&#34;more-consistent-interface&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;More consistent interface&lt;/h3&gt;
&lt;p&gt;All paginated functions that don’t return tweets now use a consistent pagination interface (except the premium endpoints).
They all store the “next cursor” in an &lt;code&gt;rtweet_cursor&lt;/code&gt; attribute, which will be automatically retrieved when you use the &lt;code&gt;cursor&lt;/code&gt; argument.
This will make it easier to continue a query you started:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;users &amp;lt;- get_followers(&amp;quot;_R_Foundation&amp;quot;)
users
     
# use `cursor` to find the next &amp;quot;page&amp;quot; of results
more_users &amp;lt;- get_followers(&amp;quot;_R_Foundation&amp;quot;, cursor = users)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Functions that return tweets support &lt;code&gt;max_id&lt;/code&gt; and &lt;code&gt;since_id&lt;/code&gt; to find earlier and later tweets respectively:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Retrieve all the tweets made since the previous request
newer &amp;lt;- search_tweets(&amp;quot;weather&amp;quot;, since_id = tweets)
# Retrieve tweets made before the previous request
older &amp;lt;- search_tweets(&amp;quot;weather&amp;quot;, max_id = tweets)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you want more tweets than the rate limits of the API allow, you can use &lt;code&gt;retryonratelimit&lt;/code&gt; to wait as long as needed:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;long &amp;lt;- search_tweets(&amp;quot;weather&amp;quot;, n = 1000, retryonratelimit = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will keep your terminal busy until the 1000 tweets are retrieved.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;saving-data&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Saving data&lt;/h3&gt;
&lt;p&gt;An unexpected consequence of returning more data (now matching that returned by the API) is that it is harder to save it in a tabular format.
For instance, one tweet might have one media item, mention two users and have three hashtags.
There isn’t a simple way to save this uniformly in a single row for all tweets without risking confusion.&lt;/p&gt;
&lt;p&gt;This resulted in deprecating &lt;code&gt;save_as_csv&lt;/code&gt;, &lt;code&gt;read_twitter_csv&lt;/code&gt; and related functions because they don’t work with the new data structure and it won’t be possible to load the complete data from a csv.
They will be removed in later versions.&lt;/p&gt;
&lt;p&gt;Many users will benefit from saving to RDS (e.g., &lt;code&gt;saveRDS()&lt;/code&gt; or &lt;code&gt;readr::write_rds()&lt;/code&gt;), and those wanting to export to tabular format can simplify the data to include only that of interest before saving with generic R functions (e.g., &lt;code&gt;write.csv()&lt;/code&gt; or &lt;code&gt;readr::write_csv()&lt;/code&gt;).&lt;/p&gt;
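&lt;p&gt;A minimal sketch of both routes, using a mock data frame with a list column (hypothetical data standing in for the output of rtweet) and temporary files:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Mock data with a nested list column (not real API output)
tweets &amp;lt;- data.frame(id_str = c(&amp;quot;1&amp;quot;, &amp;quot;2&amp;quot;), full_text = c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;))
tweets$entities &amp;lt;- list(list(hashtags = &amp;quot;rstats&amp;quot;), list(hashtags = character()))

# RDS keeps the nested structure intact
rds &amp;lt;- tempfile(fileext = &amp;quot;.rds&amp;quot;)
saveRDS(tweets, rds)
restored &amp;lt;- readRDS(rds)

# For CSV, keep only flat columns and summarise the nested ones first
flat &amp;lt;- tweets[, c(&amp;quot;id_str&amp;quot;, &amp;quot;full_text&amp;quot;)]
flat$n_hashtags &amp;lt;- vapply(tweets$entities, function(x) length(x$hashtags), integer(1))
csv &amp;lt;- tempfile(fileext = &amp;quot;.csv&amp;quot;)
write.csv(flat, csv, row.names = FALSE)

# Read ids back as character so long ids are not converted to numbers
back &amp;lt;- read.csv(csv, colClasses = c(id_str = &amp;quot;character&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which columns to keep and how to summarise the nested ones depends on the analysis; counting hashtags is only one possible choice.&lt;/p&gt;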
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;other-breaking-changes&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Other breaking changes&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Accessibility is important and for this reason if you tweet via &lt;code&gt;post_tweet()&lt;/code&gt; and add an image, gif or video you’ll need to provide the media alternative text.
Without &lt;code&gt;media_alt_text&lt;/code&gt; it will not allow you to post.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;tweet_shot()&lt;/code&gt; has been deprecated as it no longer works correctly.
It might be possible to bring it back, but the code is complex and I do not understand enough to maintain it.
If you’re interested in seeing this feature return, check out the discussion about this &lt;a href=&#34;https://github.com/ropensci/rtweet/issues/458&#34;&gt;issue&lt;/a&gt; and let me know if you have any suggestions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;rtweet also used to provide functions for data on &lt;code&gt;emojis&lt;/code&gt;, &lt;code&gt;langs&lt;/code&gt; and &lt;code&gt;stopwordslangs&lt;/code&gt;.
These are useful resources for text mining in general - not only in tweets - however they need to be updated to be helpful and would be better placed in other packages; for instance, emojis is now in the &lt;a href=&#34;https://cran.r-project.org/package=bdpar&#34;&gt;bdpar package&lt;/a&gt;.
Therefore they are no longer available in rtweet.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The functions like &lt;code&gt;suggested_*()&lt;/code&gt; have been removed as they have been broken since 2019.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;easier-authentication&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Easier authentication&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;An exciting part of this release has been a big rewrite of the authentication protocol.
While it is compatible with previous rtweet authentication methods, it also has some important new functions which make it easier to work with rtweet and the Twitter API in different ways.&lt;/p&gt;
&lt;div id=&#34;different-ways-to-authenticate&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Different ways to authenticate&lt;/h3&gt;
&lt;p&gt;If you just want to test the package, use the default authentication &lt;code&gt;auth_setup_default()&lt;/code&gt; that comes with rtweet.
If you use it for one or two days you won’t notice any problem.&lt;/p&gt;
&lt;p&gt;If you want to use the package for more than a couple of days, I recommend you set up your own token via &lt;code&gt;rtweet_user()&lt;/code&gt;.
It will open a window to authenticate via the authenticated account in your default browser.
This authentication won’t allow you to do everything but it will avoid running out of requests and being rate-limited.&lt;/p&gt;
&lt;p&gt;If you plan to make heavy use of the package, I recommend registering yourself as developer and using one of the following two mechanisms, depending on your plans:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Collect data and analyze: &lt;code&gt;rtweet_app()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Set up a bot: &lt;code&gt;rtweet_bot()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Find more information in the &lt;a href=&#34;https://docs.ropensci.org/rtweet/articles/auth.html&#34;&gt;Authentication with rtweet vignette&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;storing-credentials&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Storing credentials&lt;/h3&gt;
&lt;p&gt;Previously rtweet saved each token created, but now non-default tokens are only saved if you ask. You can save them manually via &lt;code&gt;auth_save(token, &#34;my_app&#34;)&lt;/code&gt;.
Bonus, if you name your token as default (&lt;code&gt;auth_save(token, &#34;default&#34;)&lt;/code&gt;) it will be used automatically upon loading the library.&lt;/p&gt;
&lt;p&gt;Further, tokens are now saved in the location output by &lt;code&gt;tools::R_user_dir(&#34;rtweet&#34;, &#34;config&#34;)&lt;/code&gt;, rather than in your home directory.
If you have previous tokens saved or problems identifying which token is which use &lt;code&gt;auth_sitrep()&lt;/code&gt;.
This will provide clues as to which tokens might be duplicated or misconfigured, but it won’t check if they work.
It will also automatically move your tokens to the new path.&lt;/p&gt;
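&lt;p&gt;To see where that location is on your machine, a small sketch using only base R (it assumes R 4.0.0 or later, where &lt;code&gt;tools::R_user_dir()&lt;/code&gt; is available, and it does not need rtweet to be loaded):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Directory where saved tokens are expected, as described above
token_dir &amp;lt;- tools::R_user_dir(&amp;quot;rtweet&amp;quot;, &amp;quot;config&amp;quot;)
token_dir

# Files currently stored there (an empty vector if nothing has been saved yet)
list.files(token_dir)&lt;/code&gt;&lt;/pre&gt;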
&lt;p&gt;To check which credentials you have stored use &lt;code&gt;auth_list()&lt;/code&gt; and load them via &lt;code&gt;auth_as(&#34;my_app&#34;)&lt;/code&gt;.
All the rtweet functions will use the latest token loaded with &lt;code&gt;auth_as&lt;/code&gt; (unless you manually specify one when calling it).
If you are not sure which token you are using, call &lt;code&gt;auth_get()&lt;/code&gt;: it will return the token in use, list the stored tokens or ask you to authenticate.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;other-changes-of-note&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Other changes of note&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;This is a list of other changes that aren’t too big or are not breaking changes but are worthy enough of a mention:&lt;/p&gt;
&lt;div id=&#34;iteration-and-continuation-of-requests&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Iteration and continuation of requests&lt;/h3&gt;
&lt;p&gt;Using cursors, pagination or waiting until you can make more queries is now easier.
For example you can continue previous requests via:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;users &amp;lt;- get_followers(&amp;quot;_R_Foundation&amp;quot;)
users

# use `cursor` to find the next &amp;quot;page&amp;quot; of results
more_users &amp;lt;- get_followers(&amp;quot;_R_Foundation&amp;quot;, cursor = users)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;additions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Additions&lt;/h3&gt;
&lt;p&gt;There is now a function to find a thread of a user.
You can start from any tweet and it will find all the tweets of the thread:
&lt;code&gt;tweet_threading(&#34;1461776330584956929&#34;)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There is a lot of interest in downloading and keeping track of interactions on Twitter.
The amount of interest is big enough that Twitter is releasing a new API to provide more information of this nature.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;future&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Future&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Twitter API v2 is being released and soon it will replace API v1.
rtweet up to now, including this release, uses API v1 so it will need to adapt to the new endpoints and new data returned.&lt;/p&gt;
&lt;p&gt;First will be the streaming endpoints in November, so expect more (breaking?) changes around those dates if not earlier.&lt;/p&gt;
&lt;p&gt;I would also like to make it easier for users, dependencies and the package itself to handle the outputs.
In this regard I would like to provide some classes to handle the different types of objects it returns.&lt;/p&gt;
&lt;p&gt;This will help avoid some of the current shortcomings.
Specifically I would like to provide functions to make it easier to reply to previous tweets,
extract nested data, and subset tweets and the accompanying user information.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusions&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Conclusions&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;While I made many breaking changes I hope these changes will smooth future development and help both users and maintainers.&lt;/p&gt;
&lt;p&gt;Feel free to ask on the &lt;a href=&#34;https://discuss.ropensci.org/tag/rtweet&#34;&gt;rOpenSci community&lt;/a&gt; if you have questions about the transition or find something amiss.
Please let me know! It will help me prioritize which endpoints are more relevant to the community.
(And yes, the academic archive endpoint is on the radar.)&lt;/p&gt;
&lt;p&gt;It is also possible that I overlooked something and thought the code was working when it isn’t.
For example, after several months of changing the way the API is parsed, several users found it wasn’t handling some elements.
Let me know of such or similar cases and I’ll try to fix it.&lt;/p&gt;
&lt;p&gt;In case you find a bug, check the open issues and if it has not already been reported, open an &lt;a href=&#34;https://github.com/ropensci/rtweet/issues/&#34;&gt;issue on GitHub&lt;/a&gt;.
Don’t forget to make a &lt;a href=&#34;https://cran.r-project.org/web/packages/reprex/readme/README.html&#34;&gt;reprex&lt;/a&gt; and if possible provide the id of the tweets you are having trouble with.
Unfortunately it has happened that when I came to look at a bug I couldn’t reproduce it as I wasn’t able to find the tweet which caused the error.&lt;/p&gt;
&lt;p&gt;This release includes contributions from Hadley Wickham, Bob Rudis, Alex Hayes, Simon Heß, Diego Hernán, Michael Chirico, Jonathan Sidi, Jon Harmon, Andrew Fraser and many others who reported bugs or provided feedback.
Many thanks to all of you for using it and for your interest in keeping rtweet working and improving it for everyone.&lt;/p&gt;
&lt;p&gt;Finally, you can read the whole &lt;a href=&#34;https://docs.ropensci.org/rtweet/news/index.html&#34;&gt;NEWS online&lt;/a&gt; and the examples.&lt;/p&gt;
&lt;p&gt;Happy tweeting!&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;acknowledgements&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Acknowledgements&lt;/h2&gt;
&lt;p&gt;This is a repost of the &lt;a href=&#34;https://ropensci.org/blog/2022/07/21/rtweet-1-0-0/&#34;&gt;entry for rOpenSci&lt;/a&gt;.
The post was edited and improved by Yanina Bellini Saibene and Steffi LaZerte, the community manager and assistant. Many thanks!&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&#34;footnotes footnotes-end-of-document&#34;&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id=&#34;fn1&#34;&gt;&lt;p&gt;Specifically these: &lt;code&gt;get_favorites()&lt;/code&gt;, &lt;code&gt;get_favorites_user()&lt;/code&gt;, &lt;code&gt;get_mentions()&lt;/code&gt;,
&lt;code&gt;get_my_timeline()&lt;/code&gt;, &lt;code&gt;get_retweets()&lt;/code&gt;, &lt;code&gt;get_timeline()&lt;/code&gt;, &lt;code&gt;get_timeline_user()&lt;/code&gt;,
&lt;code&gt;lists_statuses()&lt;/code&gt;, &lt;code&gt;lookup_statuses()&lt;/code&gt;, &lt;code&gt;lookup_tweets()&lt;/code&gt;, &lt;code&gt;search_30day()&lt;/code&gt;,
&lt;code&gt;search_fullarchive()&lt;/code&gt;, &lt;code&gt;search_tweets()&lt;/code&gt;, &lt;code&gt;tweet_shot()&lt;/code&gt; and &lt;code&gt;tweet_threading()&lt;/code&gt;.&lt;a href=&#34;#fnref1&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Writing a thesis with bookdown</title>
      <link>https://llrs.dev/post/2022/05/09/writing-thesis-bookdown/</link>
      <pubDate>Mon, 09 May 2022 00:00:00 +0000</pubDate>
      <guid>https://llrs.dev/post/2022/05/09/writing-thesis-bookdown/</guid>
      <description>


&lt;p&gt;In this post I document my experience writing my &lt;a href=&#34;https://thesis.llrs.dev&#34;&gt;PhD thesis&lt;/a&gt; with bookdown.
I made the thesis in web and pdf format (and epub) to make it more widely available.
Most of the experiences and advice I’ll share here come from my efforts to improve the pdf format.
It is the most important format, as it is ultimately what I’m going to use for printing.&lt;/p&gt;
&lt;p&gt;First of all, you should know there is a package, &lt;a href=&#34;https://github.com/ismayc/thesisdown&#34;&gt;thesisdown&lt;/a&gt;, with templates for a few universities.
If yours is there, or if you want to learn how they are built, you can have a look at the files.&lt;br /&gt;
In my case I didn’t have any template and the university guidelines are not long (have two compulsory pages at the beginning and have 5 sections).
That’s why I tweaked the default format, inspired by recent theses defended in my group.&lt;/p&gt;
&lt;p&gt;Well, without further delay, let’s dive into the things I learned:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#captions&#34;&gt;Captions&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#chapter-thumb&#34;&gt;Chapter thumb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#placing-options&#34;&gt;Placing options&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#index&#34;&gt;Indexes&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#toft&#34;&gt;Table of figures and tables&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#acronyms&#34;&gt;Acronyms&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#placing-floats&#34;&gt;Placing floats&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#github-actions&#34;&gt;Github actions&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#empty-pages&#34;&gt;Empty pages&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#title-pages&#34;&gt;Title pages&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#page-numbers&#34;&gt;Page numbers&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#include-pdfs&#34;&gt;Include pdfs&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#running-titles&#34;&gt;Running titles&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#merging-pdfs&#34;&gt;Merging pdfs&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;#reducing-pdf-size&#34;&gt;Reducing pdf size&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div id=&#34;important&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Important&lt;/h2&gt;
&lt;div id=&#34;captions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Captions&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;knitr::kable&lt;/code&gt; places table captions at the top (by design, see &lt;a href=&#34;https://github.com/yihui/knitr/issues/1189&#34;&gt;issue #1189&lt;/a&gt;), while knitr places figure captions at the bottom.
So if you want all the captions below the element, you’ll need to use a different package for tables (&lt;code&gt;booktable&lt;/code&gt;, or others).&lt;/p&gt;
&lt;p&gt;If you want short captions for an easily readable list of figures and list of tables, you’ll need to use &lt;code&gt;kable(caption.short = &#34;TOC&#34;, caption = &#34;Long caption below the table&#34;)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In addition, in &lt;code&gt;kable&lt;/code&gt;, if you use a name like Häsler you’ll need to convert the “ä” to “\u00E4”.&lt;/p&gt;
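&lt;p&gt;A minimal sketch of a table with a long and a short caption, assuming LaTeX output and the built-in &lt;code&gt;mtcars&lt;/code&gt; data (the caption texts are made up for illustration; to my knowledge the argument that knitr’s LaTeX tables accept for the short version is &lt;code&gt;caption.short&lt;/code&gt;):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(knitr)

# The long caption goes with the table, the short one to the list of tables
tab &amp;lt;- kable(
  head(mtcars[, 1:3]),
  format = &amp;quot;latex&amp;quot;,
  caption = &amp;quot;Fuel consumption, cylinders and displacement of the first cars in mtcars&amp;quot;,
  caption.short = &amp;quot;First rows of mtcars&amp;quot;
)
tab&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Printing &lt;code&gt;tab&lt;/code&gt; outside of a document shows the generated LaTeX, which is a quick way to check that the short caption ended up inside the square brackets of &lt;code&gt;\caption&lt;/code&gt;.&lt;/p&gt;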
&lt;p&gt;I also wanted to highlight and differentiate the captions.
I ended up using the &lt;code&gt;caption&lt;/code&gt; package:&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage{caption}
% Set the caption labels of tables and figures in bold; captions span the full text width
\captionsetup{labelfont=bf,width=\textwidth}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;\textwidth&lt;/code&gt; setting makes the captions span the full text width; otherwise they only span the width of the table or figure.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;repeating-text&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Repeating text&lt;/h3&gt;
&lt;p&gt;If you find yourself repeating some text to explain some figures, legends or tables you can use &lt;a href=&#34;https://bookdown.org/yihui/bookdown/markdown-extensions-by-bookdown.html#text-references&#34;&gt;text references&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;(ref:foo) Define a text reference **here**. &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then you can use &lt;code&gt;(ref:foo)&lt;/code&gt; to repeat the same text.&lt;/p&gt;
&lt;p&gt;Although formatting cannot be applied afterwards (i.e. &lt;code&gt;**(ref:foo)**&lt;/code&gt;), it is handy to write the text just once and avoid repetition. It also helps if, to keep backwards compatibility, you can’t use the new special comment &lt;code&gt;#|&lt;/code&gt; to specify chunk options.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;placing-options&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Placing options&lt;/h3&gt;
&lt;p&gt;Many LaTeX options go in the &lt;code&gt;index.Rmd&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;The ones I included are:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;split_by: chapter
link-citations: true 
always_allow_html: true
colorlinks: yes
# https://bookdown.org/yihui/rmarkdown-cookbook/latex-variables.html
# links-as-notes: true # Only activate for actual printing
fontfamily: libertine
fontsize: 12pt
papersize: a4 # The printed size of the thesis
acronyms:
  loa_title: &amp;quot;&amp;quot;
  insert_loa: false
  sorting: usage
  include_unused: false
  fromfile: ./style/acronyms.yml
geometry:
 - top=25.4mm
 - bottom=25.4mm
 - left=25.4mm
 - right=25.4mm
 - bindingoffset=6.4mm
 - asymmetric
classoption: 
  - twoside
  - openright
lot: yes
lof: yes&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;split_by&lt;/code&gt; In the html format, how the book is split into pages (here, one page per chapter).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;link-citations&lt;/code&gt; Whether citations should link to their entry in the bibliography.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;colorlinks&lt;/code&gt; Whether links should have a color.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;links-as-notes&lt;/code&gt; Instead of having hyperlinks have them included as notes.
It is useful for printing where the reader doesn’t have the option to click a link but might be interested in knowing more.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;fontfamily&lt;/code&gt; and &lt;code&gt;fontsize&lt;/code&gt; decide which font and size will be used.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;papersize&lt;/code&gt; Sets the page size; it determines the available space and greatly affects the position of figures and tables, which float in the text according to LaTeX’s placement algorithm.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;acronyms&lt;/code&gt; Configuration of the &lt;a href=&#34;#acronyms&#34;&gt;acronyms&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;geometry&lt;/code&gt; Defines the margins; consider that in a bound book the central zone will not be readable.
The &lt;code&gt;bindingoffset&lt;/code&gt; adds some space there to make reading easier.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;classoption&lt;/code&gt; Options for the book format&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;lot&lt;/code&gt; and &lt;code&gt;lof&lt;/code&gt; indicate if list of tables (lot) and list of figures (lof) should be included on the pdf output.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;book_filename&lt;/code&gt;, if present in index.Rmd, is overridden by the one in &lt;code&gt;_bookdown.yml&lt;/code&gt;; also be careful about what goes in &lt;code&gt;_bookdown.yml&lt;/code&gt; and what goes in the format-specific &lt;code&gt;_output.yml&lt;/code&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;nice-little-tricks&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Nice little tricks&lt;/h2&gt;
&lt;div id=&#34;dedication&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Dedication&lt;/h3&gt;
&lt;p&gt;Looking at the source code of the &lt;a href=&#34;https://github.com/rstudio/bookdown/blob/main/inst/examples/latex/before_body.tex&#34;&gt;bookdown book&lt;/a&gt; I found that the correct way to add a dedication was to use the &lt;code&gt;before_body&lt;/code&gt; option.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;chapter-thumb&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Chapter thumb&lt;/h3&gt;
&lt;p&gt;One thing I liked from other theses is having, on the printed edition, a little mark on the side of the page to find a section.&lt;/p&gt;
&lt;p&gt;My first search showed that it &lt;a href=&#34;https://tex.stackexchange.com/questions/113323/how-can-one-put-a-marker-to-every-page-in-a-chapter&#34;&gt;was possible&lt;/a&gt;, but I didn’t want to load the &lt;code&gt;tikz&lt;/code&gt; package.
I ended up using &lt;a href=&#34;https://tex.stackexchange.com/questions/262950/modify-chapter-thumb-for-appendix&#34;&gt;this solution&lt;/a&gt; after modifying the colors, the size and the position.&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage[scale=1,angle=0,opacity=1,contents={}]{background}
\usetikzlibrary{calc}
\usepackage{ifthen}
\usepackage{lipsum}
% auxiliary counter
\newcounter{chapshift}
% the list of colors to be used (add more if needed)
\newcommand\BoxColor{%
  \ifcase\thechapshift blue!30\or red!30\or olive!30\or magenta!30\or teal!30\or lime!30\or orange!30\or violet!30\or brown!30\else yellow!30\fi}
% the main command; the mandatory argument sets the color of the vertical box
\newcommand\ChapFrame{%
  \def\TitleText{\leftmark}%
  \AddEverypageHook{%
    \ifthenelse{\isodd{\value{page}}}
      {\backgroundsetup{
        contents={%
          \begin{tikzpicture}[overlay,remember picture]
          \node[fill=\BoxColor,inner sep=0pt,rectangle,text width=1cm,
            text height=3cm,align=center,anchor=north east]
          at ($ (current page.north east) + (-0cm,- 3*\thechapshift cm) $)
          % {\rotatebox{90}{\parbox{4cm}{%
          %   \centering\textcolor{black}{\scshape\thechapshift}}}};
          {};
          \end{tikzpicture}
        }%
      }
    }
    {\backgroundsetup{
      contents={%
        \begin{tikzpicture}[overlay,remember picture]
        \node[fill=\BoxColor,inner sep=0pt,rectangle,text width=1cm,
          text height=3cm,align=center,anchor=north west]
        at ($ (current page.north west) + (-0cm,-3*\thechapshift cm) $)
        % {\rotatebox{90}{\parbox{4cm}{%
        %   \centering\textcolor{black}{\scshape\thechapshift}}}};
        {};
        \end{tikzpicture}
      }
    }
  }
\BgMaterial}%
\stepcounter{chapshift}
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With this code I only need to add &lt;code&gt;\ChapFrame&lt;/code&gt; where I want the chapter thumb (I didn’t know the name before searching for this).
Once started, it changes colors according to the &lt;code&gt;chapshift&lt;/code&gt; counter, which is automatically incremented by &lt;code&gt;\ChapFrame&lt;/code&gt;.
I also set the position to change according to &lt;code&gt;\thechapshift&lt;/code&gt; so that the marks form a staircase.&lt;/p&gt;
&lt;p&gt;The code places the mark on a different side depending on whether the page is even or odd, so that it is always on the outer side of the book.
The box is 1 cm wide and 3 cm tall (&lt;code&gt;text width=1cm, text height=3cm&lt;/code&gt;) and can hold text; the commented lines in the code show how.
If you want text I recommend either short titles or the chapter number (&lt;code&gt;\thechapter&lt;/code&gt;) to ensure it is readable.&lt;/p&gt;
&lt;p&gt;A tiny trick I learned was to reset the counter with &lt;code&gt;\afterpage{\setcounter{chapshift}{0}}&lt;/code&gt; after the bibliography so that the appendix would use the same mark from the beginning.
If you want different colors for the appendix you could just create a new counter and a new &lt;code&gt;\BoxColor&lt;/code&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;index&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Index with index on it&lt;/h3&gt;
&lt;p&gt;I wanted the table of contents to show where it begins, simply because with all the added pages at the front and the blank pages it might be hard to find.
It is also handy when using the outline of the pdf version to go back to the index and then move to another section.&lt;/p&gt;
&lt;p&gt;Simply loading the &lt;code&gt;tocbibind&lt;/code&gt; package was enough:&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage{tocbibind}&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;images/screenshot_index.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;Screenshot of the outline and the real index.
The outline has the same content as the real index.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Note: I found a “bug” where the appendix link goes to the bibliography (the previous chapter) instead of the correctly displayed page.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;toft&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Table of figures and tables&lt;/h3&gt;
&lt;p&gt;It was not required, but I wanted a table of tables and a table of figures to make it easier to go to the results of the thesis.
To add them I used the &lt;code&gt;tocbibind&lt;/code&gt; package, which adds them automatically (I also set the options in &lt;code&gt;index.Rmd&lt;/code&gt;, as I wasn’t sure from the &lt;a href=&#34;https://tex.stackexchange.com/a/48512/178206&#34;&gt;answer&lt;/a&gt; I found online).&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage{tocbibind}&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;images/lof.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;Screenshot of the first lines of the list of figures, with a short caption for each figure&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;acronyms&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Acronyms&lt;/h3&gt;
&lt;p&gt;I repeat many acronyms in the thesis and I wanted to have a brief table with them.
I used the &lt;a href=&#34;https://github.com/rchaput/acronymsdown&#34;&gt;package acronymsdown&lt;/a&gt; which is simple, easy and works well for web and pdf.&lt;br /&gt;
My only wish is that it had a way to go back to where the reader was.&lt;/p&gt;
&lt;p&gt;To place the acronyms where I wanted I had to remove the automatic title and use the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Glossary {-}

\printacronyms&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I added them at the beginning, after the summaries of the thesis and the preface, right before the body of the thesis.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;placing-floats&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Placing floats&lt;/h3&gt;
&lt;p&gt;I included many figures and tables, which makes it hard to keep all of them near where they are mentioned in the text.
While placement can be forced, I didn’t want that, but I didn’t want them too far away either.&lt;/p&gt;
&lt;p&gt;To prevent them from floating past the end of a subsection I added this command (provided by the LaTeX &lt;code&gt;placeins&lt;/code&gt; package) before the title of the next subsection:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;\FloatBarrier&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This could probably be done automatically by redefining the subsection title format, but as I only had to do it 5 times it was manageable.&lt;/p&gt;
&lt;p&gt;Following an &lt;a href=&#34;https://stackoverflow.com/a/33801326/2886003&#34;&gt;answer&lt;/a&gt;, the figure floating algorithm was set with these preferences:&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage{float}
\usepackage{colortbl}
\let\origfigure\figure
\let\endorigfigure\endfigure
\renewenvironment{figure}[1][2] {
    \expandafter\origfigure\expandafter[!htbp]
} {
    \endorigfigure
}&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;images/figure_table_thumb.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;A figure and a table on the same page, the label is in bold and the text in italics, on the right side (the page is odd) the blue chapter thumb&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;github-actions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Github actions&lt;/h3&gt;
&lt;p&gt;To render I initially used &lt;a href=&#34;https://github.com/r-lib/actions&#34;&gt;r-lib/actions&lt;/a&gt; but without using any package structure.
However, once I set a DESCRIPTION file with all the package dependencies it was much faster, as I could use the &lt;a href=&#34;https://github.com/r-lib/actions/tree/v2/setup-r-dependencies&#34;&gt;setup-r-dependencies&lt;/a&gt; action.&lt;/p&gt;
&lt;p&gt;There is probably an even faster way, directly installing binaries with the help of &lt;code&gt;bspm&lt;/code&gt; or the system package manager, but this was convenient enough.&lt;/p&gt;
&lt;p&gt;Also, the action to install &lt;a href=&#34;https://github.com/r-lib/actions/tree/v2/setup-tinytex&#34;&gt;tinytex&lt;/a&gt; made my pdf rendering job much faster (from 15 minutes to 5 minutes).&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;images/gha.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;Screenshot of the github actions (on push) where bookdown-web takes ~3 minutes, bookdown-epub ~2 minutes and bookdown-pdf ~5 minutes.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;empty-pages&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Empty pages&lt;/h3&gt;
&lt;p&gt;Chapters start on the right-hand page of the book, so the previous chapter must end on a left-hand page or be followed by a blank page.
To have a completely blank page I &lt;a href=&#34;https://tex.stackexchange.com/a/1684/178206&#34;&gt;found&lt;/a&gt; a simple solution: load the LaTeX package &lt;code&gt;emptypage&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage{emptypage}&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;title-pages&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Title pages&lt;/h3&gt;
&lt;p&gt;To decide where titles are placed and how they are formatted I used &lt;code&gt;titlesec&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage{titlesec}
\titleformat{\chapter}[display]{\fontsize{32pt}{48pt}\bfseries\sffamily\filcenter}{
    \fontsize{72pt}{72pt} \thechapter \ChapFrame
} % Content on the chapter title page
{20pt}{\lsstyle}[\thispagestyle{empty} \cleardoublepage]% https://tex.stackexchange.com/a/347162/178206
\titleformat{name=\chapter, numberless}{\normalfont\huge\bfseries\filcenter}{}{20pt}{\Huge}
\titlespacing*{\chapter}{0pt}{100pt}{40pt}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first line loads the package.
Then we set the format of the chapters: the &lt;code&gt;display&lt;/code&gt; shape, the format of the text and what appears on that page.
&lt;code&gt;\thechapter&lt;/code&gt; is the number of the chapter (while &lt;code&gt;\chapter&lt;/code&gt; is the sectioning command whose titles are being formatted).
&lt;code&gt;\ChapFrame&lt;/code&gt; is the new command defined to set the &lt;a href=&#34;#chapter-thumb&#34;&gt;chapter thumb&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The other benefit of this was that the title page has no page number.&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;images/chapter.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;A screenshot of the introduction title page.
There is a number 1 and Introduction a couple of lines below.
On the right side a red box 3 cm below the top.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;page-numbers&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Page numbers&lt;/h3&gt;
&lt;p&gt;I decided to use a different style for page numbers:&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage[automark,headsepline]{scrlayer-scrpage}% sets page style scrheadings automatically
\clearpairofpagestyles
\ihead{\leftmark}
\ohead*{\pagemark}
\setkomafont{pagenumber}{}% default is \normalfont\normalcolor&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;include-pdfs&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Include pdfs&lt;/h3&gt;
&lt;p&gt;As part of the appendix I added my publications in their pdf format.
To do so I used the following code modified from the &lt;a href=&#34;https://stackoverflow.com/questions/2739159/inserting-a-pdf-file-in-latex&#34;&gt;original answer&lt;/a&gt;:&lt;/p&gt;
&lt;pre class=&#34;latex&#34;&gt;&lt;code&gt;\usepackage{pdfpages} 
\includepdf[pages=-, pagecommand={}, templatesize={\textwidth}{\textheight  - 25pt}, trim=0 0 0 20pt,]{pdf/paper.pdf}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I had to tweak the size of the area where the pdf is placed to keep the page numbers and running titles.
The last pair of curly brackets holds the path of the pdf to include.&lt;/p&gt;
&lt;p&gt;The benefit of this is that these pages are now also numbered with the thesis style and carry the running title.&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;images/pdf_included.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;Screenshot of the pdf included showing the page number of the thesis and the content of the article “Multi-omic modelling of inflammatory bowel disease with regularized canonical correlation analysis”&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;running-titles&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Running titles&lt;/h3&gt;
&lt;p&gt;If you have a long title, such as the one below, you are probably interested in having a shorter version of it for the running titles on the thesis pages.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## experDesign: stratifying samples into batches with minimal bias

\sectionmark{experDesign: paper}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I don’t think it actually made a big difference, but it might be important for chapters.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;merging-pdfs&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Merging pdfs&lt;/h3&gt;
&lt;p&gt;As I said, I needed some pages at the beginning of the thesis.
But I wanted to keep the outline of the pdf, and I found a &lt;a href=&#34;https://stackoverflow.com/a/19358402/2886003&#34;&gt;solution online&lt;/a&gt; explaining how to do it.&lt;/p&gt;
&lt;p&gt;To merge multiple pdfs I did this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;gs -q -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -o merged.pdf page1.pdf page2.pdf thesis.pdf&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;reducing-pdf-size&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Reducing pdf size&lt;/h3&gt;
&lt;p&gt;In the end the pdf was bigger than what I could send over email.&lt;/p&gt;
&lt;p&gt;I found &lt;a href=&#34;https://askubuntu.com/a/256449/270501&#34;&gt;this answer&lt;/a&gt; that helped me to reduce the size and send it.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusion&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Writing the thesis is usually one of the last steps of a PhD.
I recommend writing something down to avoid facing a blank page.
But once it is written you must take care of the presentation and style. This is a completely different skill from writing or research, so it can be especially exhausting.&lt;/p&gt;
&lt;p&gt;Bookdown, through the preamble and body options, is great for setting your style.
But if you are short of time or tired, you might benefit from reusing some of these existing solutions and modifying only what you need (as I did).&lt;/p&gt;
&lt;p&gt;To finish, you can see the final format &lt;a href=&#34;https://thesis.llrs.dev/&#34;&gt;here&lt;/a&gt;.
You can also download it as a pdf there to see most of these commands in action.&lt;/p&gt;
&lt;p&gt;If you are writing your thesis, enjoy, keep calm and reuse other solutions!&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
