When you create an R package, one burning question you always have is "who's using my package?"
The kind folks at RStudio provided us with some utilities to address this question. The cranlogs R package gives access to the daily download counts from the RStudio mirror. R authors, myself included, ate this up like candy. For example, my package icenReg has apparently been averaging 10 downloads a day so far this year, making almost 1,000 downloads this year. Don't I feel important! Before this, my heavily downward-biased estimate of use was the two emails users sent me with questions about the package, so it was nice to get some confirmation that there are people interested in my work.
However, it did not take much work to see that something else is going on here. I've actually got two packages on CRAN: icenReg and logconPH. It's worth noting that icenReg is targeted toward analysts working with real interval-censored data, while logconPH is more theoretical statistics: perhaps interesting to those developing new algorithms in that particular topic, but not ready for actual data analysis. Yet the daily downloads of the two packages were highly correlated: r = 0.70 (and that includes outliers that don't coincide between the two time series!).
Sarcastically speaking, maybe there are just a lot of people interested in what I have been doing, and that's why the downloads are correlated? This notion is easily shown not to be the driving force. To examine it, I looked at a random obscure package: "batman". From a quick look through the help files, this appears to be a package for corralling the different possible TRUE/FALSE answers in survey data (i.e., "yes", "YES", "y", ... -> TRUE). It is not related at all to interval censoring (the shared application of icenReg and logconPH) or to me. Yet batman's daily downloads are still correlated with icenReg's (r = 0.68) and with logconPH's (r = 0.90)! It's worth noting that icenReg has had 2 updates in the time period presented, while logconPH and batman have had 0.
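For anyone who wants to run the same kind of check on their own packages, here is a minimal sketch in R. It assumes the long data frame that `cranlogs::cran_downloads()` returns (columns `date`, `count`, `package`). The helper name `download_correlations` is made up for illustration, and the `"last-month"` window in the usage comment is just a convenient choice, not the time period discussed here.

```r
# Hypothetical helper: pairwise Pearson correlations of daily download counts.
# `daily` is assumed to have columns date, package and count, which is the
# shape cranlogs::cran_downloads() returns.
download_correlations <- function(daily) {
  # Pivot to a matrix: one row per date, one column per package.
  wide <- tapply(daily$count, list(daily$date, daily$package), sum)
  # A date missing for one package shows up as NA; use the pairs that exist.
  cor(wide, use = "pairwise.complete.obs")
}

# Usage (requires the cranlogs package and network access):
# daily <- cranlogs::cran_downloads(
#   packages = c("icenReg", "logconPH", "batman"),
#   when = "last-month"
# )
# round(download_correlations(daily), 2)
```

The result is a symmetric matrix with one row and one column per package, so each off-diagonal entry is the correlation between two packages' daily counts.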
Personally, I simply cannot believe that the majority of the people downloading my packages are ambitious individuals interested in standard interval-censored regression models, novel algorithms for the log-concave estimator for interval-censored data, and turning "yes"es into TRUE, all on the same day. Instead, the heavy correlation between completely unrelated packages (batman and logconPH) leads me to believe that there is some mass automated downloading of packages on given days. I assume this is other mirrors getting updates from RStudio's mirror?
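To see why a shared automated component alone would be enough to produce this pattern, here is a toy simulation in R. Every number in it is invented for illustration (the Poisson rates, the one-year window, the seed), and it says nothing about what the real automated traffic looks like. Under these assumed rates, the expected correlation between the two observed series is 8 / (8 + 2) = 0.8, even though the human downloads of the two packages are independent.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
n_days <- 365

# Made-up rates, chosen only for illustration.
bots <- rpois(n_days, lambda = 8)      # automated downloads hitting every package
humans_a <- rpois(n_days, lambda = 2)  # real users of package A
humans_b <- rpois(n_days, lambda = 2)  # real users of an unrelated package B

# Observed daily counts are human plus automated downloads.
r_observed <- cor(humans_a + bots, humans_b + bots)
# With the automated traffic stripped out, the two series are independent.
r_humans <- cor(humans_a, humans_b)

round(c(observed = r_observed, humans_only = r_humans), 2)
```

The larger the automated share of the daily counts, the closer the observed correlation gets to 1, regardless of how unrelated the packages are.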
Whatever it is, it sadly appears that we authors need to take the download counts from cranlogs with a grain of salt.