One Fine Day with the calibrate command
I've encountered some odd behavior testing the calibrate command. The behavior is the response to the command
c0 1 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16
The command attempts to set all the joint offsets to a known value for testing. The documentation describes the T_CALIBRATE ('c') command as
send the robot to calibration posture for attaching legs and fine-tuning the joint offsets. c jointIndex1 offset1 jointIndex2 offset2 ... e.g. c0 7 1 -4 2 3 8 5
This defines two forms for the command:
A single 'c', (basic command)
A 'c' followed by a offset list
When I issue the basic command
TEST_F(ftfBittleXProtocol, CALIBRATE) {
EXPECT_TRUE(on_calibrate());
EXPECT_EQ(5, response.size());
ASSERT_TRUE(on_abort());
}
I get the response:
response : 5 lines
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
c
c
response : end
The line "calib" is in accordance with the documentation which specifies the calibration posture is set by the command. This is the result of executing the posture skill command "calib". Notice the superfluous trailing comma at the end of the offset list and that command terminates with the echo of the command token twice.
Given this, my expectation of the command response is
response : 5 lines
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1, 0, 0, 0, 0, 0, 0, 0, 9, 10, 11, 12, 13, 14, 15, 16,
c
c
response : end
That's not what I get, though:
response : 9 lines
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
c
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
c
c
response : end
Disturbia.
I try changing a single offset, "c8 8":
response : 5 lines
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0,
c
c
response : end
That seems right. Now I try commands in sequence: "c0 1" followed by "c8 9":
response : 5 lines
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
c
c
response : end
...
response : 4 lines
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0,
c
c
response : end
The explanation for why the "calib: line, that sets the posture, is dropped is that the code recognizes that the command prior to the current command (the lastcmd) was the calibrate command so it doesn't reset the posture. This efficiency causes difficulty in response processing, i.e., the response to the same command depends on the prior command. However, it does imply that the calibration posture is somehow related to setting the calibration offsets.
But all this doesn't explain the response to setting a list of more than one offset:
c0 1 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16
response : 9 lines
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
c
calib
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
c
c
response : end
To understand this I need to look at how the T_CALIBRATE command is processed in the workhorse function reaction. When I looked at the code I came across the following (at line 381)
else {
// ...
char *pch;
pch = strtok(newCmd, " ,");
// ...
do { /* ... */ } while (pch != NULL);
// ...
delete[] pch;
}
Disturbia.
First, I thought that microcontroller code avoided dynamic memory allocation although it does exist (e.g., QList)
But second, strtok doesn't allocate memory, so there's nothing to delete unless there's an allocation after the do-while loop ends. There isn't one. However, the newCmd buffer is allocated during initialization.
char *newCmd = new char[BUFF_LEN + 1];
Since the do-while loop terminates when pch is NULL, the delete operator deletes a nul pointer which is allowed and benign. This leaves newCmd unaffected.
OTOH, if pch is ever non-nul the delete will do some sort of very bad thing.
Digging deeper, the body of the do-while loop must update pch in order to process the space-comma-tab separated list. It does,
do {
//...
for (byte b = 0; b < 2 && pch != NULL; b++) {
//...
pch = strtok(NULL, " ,\t");
//...
}
//...
The intent is to extract each element pair from the list, which is fine. But then we get
if (token == T_CALIBRATE) {
//...
if (lastToken != T_CALIBRATE) {
//...
strcpy(newCmd, "calib");
loadBySkillName(newCmd);
}
//...
More Disturbia.
The problem is that strok is currently using the newCmd buffer and this overwrites the contents in order to load the calibrate posture. This is done in the body of the do-while loop so the next iteration will try to execute
for (byte b = 0; b < 2 && pch != NULL; b++) {
target[b] = atoi(pch); //@@@ cast
pch = strtok(NULL, " ,\t");
inLen++;
}
using a buffer that now contains "calib". The pch pointer has the address of the 4th element of the newCmd buffer (after scanning the first 2 elements of the string "0 1 8 9...") but is actually pointing to the 'b' of "calib". The function atoi should return zero and the call to strtok return a nul. This results in a partial read of the first of the two elements.
After this the only significant action is a second printing of the servoCalib array, which explains the output I see.
There's a surprise!
A test is necessary for this sort of analysis. First I perform the baseline test
TEST_F(utfreaction, strtok_baseline)
{
char newCmd[]{ "0 1 8 9 10 11 12 13 14 15 15 16" };
char delim[]{ " ,\t" };
char* pch{ strtok(newCmd, delim) };
do {
int target[2]{};
int inLen{0};
for (byte b = 0; b < 2 && nullptr != pch; ++b) {
cout << "pch: " << setw(3) << pch
<< " atoi(pch): " << setw(3) << atoi(pch)
<< '\n';
target[b] = atoi(pch);
pch = strtok(NULL, " ,\t");
inLen++;
}
} while (nullptr != pch);
}
[ RUN ] utfreaction.strtok_baseline
pch: 0 atoi(pch): 0
pch: 1 atoi(pch): 1
pch: 8 atoi(pch): 8
pch: 9 atoi(pch): 9
pch: 10 atoi(pch): 10
pch: 11 atoi(pch): 11
pch: 12 atoi(pch): 12
pch: 13 atoi(pch): 13
pch: 14 atoi(pch): 14
pch: 15 atoi(pch): 15
pch: 15 atoi(pch): 15
pch: 16 atoi(pch): 16
[ OK ] utfreaction.strtok_baseline (0 ms)
Next is the version with strcpy,
TEST_F(utfreaction, strtok_strcpy)
{
char newCmd[]{ "0 1 8 9 10 11 12 13 14 15 15 16" };
char delim[]{ " ,\t" };
char* pch{ strtok(newCmd, delim) };
do {
int target[2]{};
int inLen{ 0 };
for (byte b = 0; b < 2 && nullptr != pch; ++b) {
cout << "pch: " << setw(3) << pch
<< " atoi(pch): " << setw(3) << atoi(pch)
<< '\n';
target[b] = atoi(pch);
pch = strtok(NULL, " ,\t");
inLen++;
}
strcpy(newCmd, "calib");
} while (nullptr != pch);
}
[ RUN ] utfreaction.strtok_strcpy
pch: 0 atoi(pch): 0
pch: 1 atoi(pch): 1
pch: b atoi(pch): 0
pch: 9 atoi(pch): 9
pch: 10 atoi(pch): 10
pch: 11 atoi(pch): 11
pch: 12 atoi(pch): 12
pch: 13 atoi(pch): 13
pch: 14 atoi(pch): 14
pch: 15 atoi(pch): 15
pch: 15 atoi(pch): 15
pch: 16 atoi(pch): 16
[ OK ] utfreaction.strtok_strcpy (16 ms)
That was unexpected. The 'b of "calib" just happens to align with the '8' of the input and allows strtok to skip over it.
This twist leaves the explanation for the output unanswered.
Any ideas?
Conclusion
The delete [] pch seems dangerous and I'd remove it.
Consider using a local buffer to process the offset list. Its size should be enough to hold 16 pairs of int8_t values plus 15 delimiters. Something like,
//...
char list[(DOF * sizeof(int8_t)) + (DOF-1)]{};
memcpy(list, newCmd, sizeof(list));
char *pch;
pch = strtok(list, " ,");
//...
Keeping in mind the input is free format though.
Edit: replace newCmd with use list variable in suggestion
Addendum: After additional thought, it's probably best to separate the list extraction from setting the posture and then setting the offsets. It frees up the newCmd buffer..
Resolution Options
I think these are the possible resolutions:
Ignore the issue and whistle a happy tune. Nobody is complaining.
Document the requirement and behavior and create a guthub issue; defer until refactoring pass. [Recommended]
Modify the code.
I think there's an easy fix that maintains the current behavior and is inexpensive. Use a local buffer to extract the offset list before setting the calibration posture and processing the offset list. This will free the newCmd buffer for setting the calibration posture and allow filtering of the offset list.
At the time the list extraction begins, we know the token is T_CALIBRATE and the length of the command in the newCmd buffer (cmdLen).
The nominal form of the offset list is
< offset list > := < joint idx > DELIMITER < offset > | < joint idx > DELIMITER < offset > < offset list > < joint idx > := < number > < offset > := < number > DELIMITER := SPACE | COMMA } TAB
The joint index should be in the range [0, DOF), i.e., zero to one less than DOF. The offset should be in the range [-120, 120] but nominally should be (-15, 15).
The offset list size is variable, i.e., the number of elements in the list is unspecified, but nominally should have no more than DOF elements. The order of the elements is unspecified as well.
We need to extract only the valid offsets from the list; we can discard invalid list elements. We only need a buffer large enough to hold the offsets for valid joints.
Given that, the simplest way I can think of to do this without a vector
bool extract_offset_list( char* buffer // source buffer , int8_t* list // offset list , const char* delimeter // token delimeters ) { char* pch{ strtok(buffer, delimeter) }; while (nullptr != pch) { int8_t idx{ int8_t(atoi(pch)) }; pch = strtok(nullptr, delimeter); int8_t offset{ (nullptr == pch) ? int8_t(127) : int8_t(atoi(pch)) }; if (0 <= idx && idx < DOF) { list[idx] = offset; } pch = strtok(nullptr, delimeter); } return true; } usage: char newCmd[]{ "0 1 8 9 10 11 12 13 14 15 15 16" }; int8_t list[DOF]{}; memset(list, 127, sizeof(list)); // default value; (unmodified) const char* delimeter{ " ,\t" }; extract_offset_list(newCmd, list, delimeter); // ...load "calib" posture if necessary // ...apply modified offsets from list
The behavior of strtok allows the implementation to handle malformed input, e.g., c0,,1,,8,,9,,... Whether that is desirable is an open question. Input syntax validation can be expensive and is not generally performed in the codebase.
The memory needed is stack based, small, and should be available. Further testing is needed since reaction is a critical function.
Edit: fix function call name
Addendum: I've submitted issue #16 on this in the OpenCatEsp32 repository