Another day in debugging hell. You know when you ask people: “What’s wrong?” and they respond with: “Nothing.”, even though there clearly is something wrong? This is how I like to think about missing error messages when there clearly should be one. The issue I encountered with my code today was the following:
(not_cool_r <- data.frame(col1 = 1:3, incomplete_colname = 4:6, incomp = 7:9))
## col1 incomplete_colname incomp
## 1 1 4 7
## 2 2 5 8
## 3 3 6 9
Now, we can access the column incomplete_colname like this:
not_cool_r$incomplete_colname
## [1] 4 5 6
But you know what also works?
This:
not_cool_r$incomplete_colnam
## [1] 4 5 6
Or this:
not_cool_r$incomplete_colna
## [1] 4 5 6
In fact …
not_cool_r$incomplete_colname
not_cool_r$incomplete_colnam
not_cool_r$incomplete_colna
not_cool_r$incomplete_coln
not_cool_r$incomplete_col
not_cool_r$incomplete_co
not_cool_r$incomplete_c
not_cool_r$incomplete_
not_cool_r$incomplete
not_cool_r$incomplet
not_cool_r$incomple
not_cool_r$incompl # still gives column incomplete_colname (shortest unique prefix)
not_cool_r$incomp # exact match: gives column incomp
not_cool_r$incom # gives NULL from here because it could be one of two cols
not_cool_r$inco
not_cool_r$inc
This is the first call that will give us a different output, because it exactly matches the name of column incomp.
not_cool_r$incomp
## [1] 7 8 9
When leaving out more letters, the result will be NULL, because this time, R doesn’t know whether we meant the column incomplete_colname or incomp.
not_cool_r$incom
## NULL
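The matching rule behind all of this can be tried out directly with base R's pmatch(), which, as far as I can tell, applies the same logic as $ does on a data frame: an exact match wins, a unique partial match is accepted, and an ambiguous prefix matches nothing. A minimal sketch (df is just a stand-in copy of the example data, not a name from the original code):

```r
# Same columns as not_cool_r above; `df` is a stand-in name for this sketch
df <- data.frame(col1 = 1:3, incomplete_colname = 4:6, incomp = 7:9)

# pmatch() returns the position of the matched name, or NA if there is none
pmatch("incompl", names(df))  # unique partial match: position of incomplete_colname
pmatch("incomp", names(df))   # exact match takes precedence: position of incomp
pmatch("incom", names(df))    # ambiguous prefix: NA

# `$` on a data frame behaves like `[[` with exact = FALSE
identical(df$incompl, df[["incompl", exact = FALSE]])
```

So $ is essentially the lenient cousin of [[, and the leniency is what hides the typo.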
Why is this a problem? In my case, I called a column that wasn’t there. This should have given me an error, but instead, R silently used a different column whose name happened to start with the column name I had typed.
Sure, you can get around this with clever naming. However, with many variable and column names, you might lose track at some point. At least I didn’t think that sub_id (subject ID) and sub_i (subject intercept) would be a problem.
So, a foolproof way is to access the column via [[]], which matches names exactly by default and will give you NULL for a column that doesn’t exist.
not_cool_r[["incomplete_colname"]]
## [1] 4 5 6
not_cool_r[["incomplete_colnam"]]
## NULL
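A NULL is still easy to miss further down the line, so if you would rather get a real error, one option is a small wrapper around [[]]. This is only a sketch: strict_col() is a hypothetical helper I made up for illustration, not a base R function, and df is a stand-in copy of the example data.

```r
# Hypothetical helper (not part of base R): return a column or fail loudly
strict_col <- function(df, name) {
  if (!name %in% names(df)) {
    stop("Column '", name, "' not found in data frame", call. = FALSE)
  }
  df[[name]]
}

df <- data.frame(col1 = 1:3, incomplete_colname = 4:6, incomp = 7:9)

strict_col(df, "incomplete_colname")      # returns the column
try(strict_col(df, "incomplete_colnam"))  # error instead of a silent NULL
```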
Or use [ , ], which will give you an error (but don’t use [], see here).
not_cool_r[ , "incomplete_colname"]
## [1] 4 5 6
try(not_cool_r[ , "incomplete_colnam"])
## Error in `[.data.frame`(not_cool_r, , "incomplete_colnam") :
## undefined columns selected
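If you don't want to give up $ altogether, base R also has an option called warnPartialMatchDollar that is switched off by default. In reasonably recent R versions it should make $ warn whenever it falls back to partial matching, on data frames as well. The sketch below assumes that behaviour and uses df as a stand-in copy of the example data; the call still returns the partially matched column, it just stops being silent about it.

```r
df <- data.frame(col1 = 1:3, incomplete_colname = 4:6, incomp = 7:9)

# Ask R to warn whenever `$` falls back to partial matching
old <- options(warnPartialMatchDollar = TRUE)

df$incomplete_colnam   # still returns the column, but now with a warning

# Restore the previous setting
options(old)
```

Setting the option in your .Rprofile would apply it to every session, at the cost of also surfacing warnings from other people's code that relies on partial matching.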
For you tidyverse kids out there: The standard tidy syntax doesn’t do this …
library(tidyverse)
not_cool_r %>%
  mutate(new_col = incomplete_colname * 2)
## col1 incomplete_colname incomp new_col
## 1 1 4 7 8
## 2 2 5 8 10
## 3 3 6 9 12
try(
  not_cool_r %>%
    mutate(new_col = incomplete_colnam * 2)
)
## Error : object 'incomplete_colnam' not found
… and tibbles will give you a warning.
not_cool_r <- tibble(col1 = 1:3, incomplete_colname = 4:6, incomp = 7:9)
not_cool_r$incomplete_colname
## [1] 4 5 6
not_cool_r$incomplete_colnam
## Warning: Unknown or uninitialised column: 'incomplete_colnam'.
## NULL
Find the .Rmd here.