Mutepig's Blog
# vfprintf source code analysis
mut3p1g
I could not work out how `snprintf` lets a `vararg` overflow `format`, so I ended up reading the [source](https://code.woboq.org/userspace/glibc/stdio-common/vfprintf.c.html#160args_value).

## vfprintf

### 0. Overview

#### 1) Parameters

`vfprintf` formats the values in the argument list `arg` according to `format` and writes the result to the stream `stream`. It is declared as:

```
int vfprintf(FILE *stream, const char *format, va_list arg)
```

A conversion specification inside `format` has the form `%[flags][width][.prec][mod][type]`.

* flags: flag characters

Character|Meaning
-|-
space|non-negative numbers get a leading space, negative numbers a `-`
`+`|non-negative numbers get a leading `+`, negative numbers a `-`
`-`|left-justify within the field and pad on the right
`#`|prefix `%o` output with `0` and `%x` output with `0x`

* width: field width

Character|Meaning
-|-
`0`-`9`|use the given number as the width
`*`|take the next `int` argument as the width

* .prec: precision

Character|Meaning
-|-
`.0`-`.9`|use the given number as the precision
`.*`|take the next `int` argument as the precision

* mod: length modifier giving the exact type of the argument that follows

Character|Meaning
-|-
h|short
l|long
ll|long long
L|long double

* type: conversion specifier

Character|Meaning
---|---
c|character
d\|i|signed decimal integer
e|signed scientific notation with `e`
E|signed scientific notation with `E`
f|decimal floating point
g|the shorter of `%e` and `%f`
G|the shorter of `%E` and `%f`
o|unsigned octal integer
s|string
u|unsigned decimal integer
x|unsigned hexadecimal integer
X|unsigned hexadecimal integer (uppercase letters)
p|pointer, printed in hexadecimal
n|prints nothing; stores the number of characters written before `%n` into the `int` the argument points to
hn|prints nothing; stores the number of characters written before `%hn` into the `short` the argument points to
\$|`%[num]$` selects the num-th argument

### 1. Initialization

#### 1) Variable declarations

```
/* The function itself.  */
int
vfprintf (FILE *s, const CHAR_T *format, va_list ap)
{
  /* The character used as thousands separator.  */
  THOUSANDS_SEP_T thousands_sep = 0;

  /* The string describing the size of groups of digits.  */
  const char *grouping;

  /* Place to accumulate the result.  */
  int done;

  /* Current character in format string.  */
  const UCHAR_T *f;   // the character of format currently being processed

  /* End of leading constant string.  */
  const UCHAR_T *lead_str_end;

  /* Points to next format specifier.  */
  const UCHAR_T *end_of_spec;

  /* Buffer intermediate results.  */
  CHAR_T work_buffer[WORK_BUFFER_SIZE];
  CHAR_T *workstart = NULL;
  CHAR_T *workend;

  /* We have to save the original argument pointer.  */
  va_list ap_save;

  /* Count number of specifiers we already processed.  */
  int nspecs_done;

  /* For the %m format we may need the current `errno' value.
     */
  int save_errno = errno;

  /* 1 if format is in read-only memory, -1 if it is in writable memory,
     0 if unknown.  */
  int readonly_format = 0;
```

#### 2) Initialization

The function starts with a series of sanity checks, then locates the first `%` in `format` and writes everything before it straight to `stream`. This already answers the original question: `snprintf` copies the literal text of `format` into the destination buffer as it goes, and if that buffer overflows into the memory holding `format`, the parser keeps reading from the overwritten `format`, which gives a format string vulnerability.

```
#ifdef COMPILE_WPRINTF
  /* Find the first format specifier.  */
  f = lead_str_end = __find_specwc ((const UCHAR_T *) format);  // returns a pointer to the first % in format
#else
  /* Find the first format specifier.  */
  f = lead_str_end = __find_specmb ((const UCHAR_T *) format);
#endif

  /* Lock stream.  */
  _IO_cleanup_region_start ((void (*) (void *)) &_IO_funlockfile, s);
  _IO_flockfile (s);

  /* Write the literal text before the first format.  */
  outstring ((const UCHAR_T *) format,
             lead_str_end - (const UCHAR_T *) format);  // write the text before the first % to stream

  /* If we only have to print a simple string, return now.  */
  if (*f == L_('\0'))  // format contains no %
    goto all_done;

  /* Use the slow path in case any printf handler is registered.  */
  if (__glibc_unlikely (__printf_function_table != NULL
                        || __printf_modifier_table != NULL
                        || __printf_va_arg_table != NULL))
    goto do_positional;
```

### 2. Processing format

#### 0) Dispatching on characters

`format` can contain many different characters, each with its own handler, so we first look at how a character is mapped to its handler.

##### a. jump_table

`jump_table` covers the characters from space through `z` and maps each one to a class index, which is later used to index the `stepX_jumps` tables. Characters that cannot be handled map to 0; the indices of the characters that can be handled are shown below:

```
static const uint8_t jump_table[] =
  {
    /* ' ' */  1, 0, 0, /* '#' */  4,
    0, /* '%' */ 14, 0, /* '\''*/  6,
    0, 0, /* '*' */  7, /* '+' */  2,
    0, /* '-' */  3, /* '.'
 */  9, 0,
    /* '0' */  5, /* '1' */  8, /* '2' */  8, /* '3' */  8,
    /* '4' */  8, /* '5' */  8, /* '6' */  8, /* '7' */  8,
    /* '8' */  8, /* '9' */  8, 0, 0,
    0, 0, 0, 0,
    0, /* 'A' */ 26, 0, /* 'C' */ 25,
    0, /* 'E' */ 19, /* F */   19, /* 'G' */ 19,
    0, /* 'I' */ 29, 0, 0,
    /* 'L' */ 12, 0, 0, 0,
    0, 0, 0, /* 'S' */ 21,
    0, 0, 0, 0,
    /* 'X' */ 18, 0, /* 'Z' */ 13, 0,
    0, 0, 0, 0,
    0, /* 'a' */ 26, 0, /* 'c' */ 20,
    /* 'd' */ 15, /* 'e' */ 19, /* 'f' */ 19, /* 'g' */ 19,
    /* 'h' */ 10, /* 'i' */ 15, /* 'j' */ 28, 0,
    /* 'l' */ 11, /* 'm' */ 24, /* 'n' */ 23, /* 'o' */ 17,
    /* 'p' */ 22, /* 'q' */ 12, 0, /* 's' */ 21,
    /* 't' */ 27, /* 'u' */ 16, 0, 0,
    /* 'x' */ 18, 0, /* 'z' */ 13
  };
```

`CHAR_CLASS` returns the class index for a given character:

```
#define CHAR_CLASS(Ch) (jump_table[(INT_T) (Ch) - L_(' ')])
```

##### b. stepX_jumps

Each `stepX_jumps` is a `const int` array of 30 entries, one per character class. An entry locates the handler label for that class while parsing is in step `X`, where `X` is one of:

X|Meaning
-|-
0|initial state
1|width has been processed
2|precision has been processed
3a|a first `h` has been processed
3b|a first `l` has been processed
4|all modifiers have been processed; only the conversion specifier remains

##### c. JUMP

This macro jumps to the handler that the given `table` assigns to a character.

```
/* Jump to the handler that table assigns to the character ChExpr.  */
# define JUMP(ChExpr, table) \
      do \
        { \
          int offset; \
          void *ptr; \
          spec = (ChExpr); \
          /* Outside ' '..'z': form_unknown; otherwise look up the class.  */ \
          offset = NOT_IN_JUMP_RANGE (spec) ? REF (form_unknown) \
            : table[CHAR_CLASS (spec)]; \
          ptr = &&JUMP_TABLE_BASE_LABEL + offset; \
          goto *ptr; \
        } \
      while (0)
```

#### 1) Processing the format characters

##### a. Declarations and initialization

```
  /* Process whole format string.  */
  do
    {
      STEP0_3_TABLE;  // defines step0_jumps through step3b_jumps
      STEP4_TABLE;    // defines step4_jumps

      union printf_arg *args_value;  /* This is not used here but ... */
      int is_negative;  /* Flag for negative number.  */
      union
      {
        unsigned long long int longlong;
        unsigned long int word;
      } number;
      int base;
      union printf_arg the_arg;
      CHAR_T *string;  /* Pointer to argument string.  */
      int alt = 0;  /* Alternate format.
 */
      int space = 0;  /* Use space prefix if no sign is needed.  */
      int left = 0;  /* Left-justify output.  */
      int showsign = 0;  /* Always begin with plus or minus sign.  */
      int group = 0;  /* Print numbers according grouping rules.  */
      int is_long_double = 0;  /* Argument is long double/ long long int.  */
      int is_short = 0;  /* Argument is short int.  */
      int is_long = 0;  /* Argument is long int.  */
      int is_char = 0;  /* Argument is promoted (unsigned) char.  */
      int width = 0;  /* Width of output; 0 means none specified.  */
      int prec = -1;  /* Precision of output; -1 means none specified.  */
      /* This flag is set by the 'I' modifier and selects the use of the
         `outdigits' as determined by the current locale.  */
      int use_outdigits = 0;
      UCHAR_T pad = L_(' ');  /* Padding character.  */  // space by default
      CHAR_T spec;

      workstart = NULL;
      workend = work_buffer + WORK_BUFFER_SIZE;
```

##### b. Handler label definitions

Every handler label has the following shape:

```
LABEL (Name):
  something to do ..
  JUMP (*++f, stepX_jumps);  // the handler decides which step X table the next character is looked up in
```

###### i. flags

Char | Label | Action | Next X
-|-|-|-
space|flag_space|space = 1;|0
`+`|flag_plus|showsign = 1;|0
`-`|flag_minus|left = 1; pad = L_(' ');|0
`#`|flag_hash|alt = 1;|0
`0`|flag_zero|if (!left) pad = L_('0');|0
`I`|flag_i18n|use_outdigits = 1;|0

* Single quote (`'`)

```
      /* The '\'' flag.  */
    LABEL (flag_quote):
      group = 1;

      if (grouping == (const char *) -1)
        {
#ifdef COMPILE_WPRINTF
          thousands_sep = _NL_CURRENT_WORD (LC_NUMERIC,
                                            _NL_NUMERIC_THOUSANDS_SEP_WC);
#else
          thousands_sep = _NL_CURRENT (LC_NUMERIC, THOUSANDS_SEP);
#endif

          grouping = _NL_CURRENT (LC_NUMERIC, GROUPING);
          if (*grouping == '\0' || *grouping == CHAR_MAX
#ifdef COMPILE_WPRINTF
              || thousands_sep == L'\0'
#else
              || *thousands_sep == '\0'
#endif
              )
            grouping = NULL;
        }
      JUMP (*++f, step0_jumps);
```

###### ii. width

* `*`: the width is taken from the argument list

```
      /* Get width from argument.  */
    LABEL (width_asterics):
      {
        const UCHAR_T *tmp;  /* Temporary value.  */

        tmp = ++f;
        if (ISDIGIT (*tmp))
          {
            int pos = read_int (&tmp);

            if (pos == -1)
              {
                __set_errno (EOVERFLOW);
                done = -1;
                goto all_done;
              }

            if (pos && *tmp == L_('$'))
              /* The width comes from a positional parameter.  */
              goto do_positional;
          }
        width = va_arg (ap, int);

        /* Negative width means left justified.  */
        if (width < 0)
          {
            width = -width;
            pad = L_(' ');
            left = 1;
          }

        if (__glibc_unlikely (width >= INT_MAX / sizeof (CHAR_T) - EXTSIZ))
          {
            __set_errno (EOVERFLOW);
            done = -1;
            goto all_done;
          }

        if (width >= WORK_BUFFER_SIZE - EXTSIZ)
          {
            /* We have to use a special buffer.  */
            size_t needed = ((size_t) width + EXTSIZ) * sizeof (CHAR_T);
            if (__libc_use_alloca (needed))
              workend = (CHAR_T *) alloca (needed) + width + EXTSIZ;
            else
              {
                workstart = (CHAR_T *) malloc (needed);
                if (workstart == NULL)
                  {
                    done = -1;
                    goto all_done;
                  }
                workend = workstart + width + EXTSIZ;
              }
          }
      }
      JUMP (*f, step1_jumps);
```

* `1`-`9`: the width is written directly in the format string

```
      /* Given width in format string.  */
    LABEL (width):
      width = read_int (&f);

      if (__glibc_unlikely (width == -1
                            || width >= INT_MAX / sizeof (CHAR_T) - EXTSIZ))
        {
          __set_errno (EOVERFLOW);
          done = -1;
          goto all_done;
        }

      if (width >= WORK_BUFFER_SIZE - EXTSIZ)
        {
          /* We have to use a special buffer.
           */
          size_t needed = ((size_t) width + EXTSIZ) * sizeof (CHAR_T);
          if (__libc_use_alloca (needed))
            workend = (CHAR_T *) alloca (needed) + width + EXTSIZ;
          else
            {
              workstart = (CHAR_T *) malloc (needed);
              if (workstart == NULL)
                {
                  done = -1;
                  goto all_done;
                }
              workend = workstart + width + EXTSIZ;
            }
        }
      if (*f == L_('$'))
        /* Oh, oh.  The argument comes from a positional parameter.  */
        goto do_positional;
      JUMP (*f, step1_jumps);
```

###### iii. precision

```
    LABEL (precision):
      ++f;
      if (*f == L_('*'))
        {
          const UCHAR_T *tmp;  /* Temporary value.  */

          tmp = ++f;
          if (ISDIGIT (*tmp))
            {
              int pos = read_int (&tmp);

              if (pos == -1)
                {
                  __set_errno (EOVERFLOW);
                  done = -1;
                  goto all_done;
                }

              if (pos && *tmp == L_('$'))
                /* The precision comes from a positional parameter.  */
                goto do_positional;
            }
          prec = va_arg (ap, int);

          /* If the precision is negative the precision is omitted.  */
          if (prec < 0)
            prec = -1;
        }
      else if (ISDIGIT (*f))
        {
          prec = read_int (&f);

          /* The precision was specified in this case as an extremely
             large positive value.  */
          if (prec == -1)
            {
              __set_errno (EOVERFLOW);
              done = -1;
              goto all_done;
            }
        }
      else
        prec = 0;
      if (prec > width && prec > WORK_BUFFER_SIZE - EXTSIZ)
        {
          /* Deallocate any previously allocated buffer because it is
             too small.
           */
          if (__glibc_unlikely (workstart != NULL))
            free (workstart);
          workstart = NULL;
          if (__glibc_unlikely (prec >= INT_MAX / sizeof (CHAR_T) - EXTSIZ))
            {
              __set_errno (EOVERFLOW);
              done = -1;
              goto all_done;
            }
          size_t needed = ((size_t) prec + EXTSIZ) * sizeof (CHAR_T);

          if (__libc_use_alloca (needed))
            workend = (CHAR_T *) alloca (needed) + prec + EXTSIZ;
          else
            {
              workstart = (CHAR_T *) malloc (needed);
              if (workstart == NULL)
                {
                  done = -1;
                  goto all_done;
                }
              workend = workstart + prec + EXTSIZ;
            }
        }
      JUMP (*f, step2_jumps);
```

###### iv. h, l, L

Char | Label | Action | Next X
-|-|-|-
h|mod_half|is_short = 1;|3a
hh|mod_halfhalf|is_short = 0; is_char = 1;|4
l|mod_long|is_long = 1;|3b
ll\|L\|q|mod_longlong|is_long_double = 1; is_long = 1;|4
z\|Z|mod_size_t|is_long_double = sizeof (size_t) > sizeof (unsigned long int); is_long = sizeof (size_t) > sizeof (unsigned int);|4
t|mod_ptrdiff_t|is_long_double = sizeof (ptrdiff_t) > sizeof (unsigned long int); is_long = sizeof (ptrdiff_t) > sizeof (unsigned int);|4
j|mod_intmax_t|is_long_double = sizeof (intmax_t) > sizeof (unsigned long int); is_long = sizeof (intmax_t) > sizeof (unsigned int);|4

##### c. Handling the conversion specifier

The conversion itself is handled roughly inside the following loop:

```
      while (1)
        {
          process_arg (((struct printf_spec *) NULL));
          process_string_arg (((struct printf_spec *) NULL));

        LABEL (form_unknown):
          if (spec == L_('\0'))
            {
              /* The format string ended before the specifier is complete.  */
              __set_errno (EINVAL);
              done = -1;
              goto all_done;
            }

          /* If we are in the fast loop force entering the complicated
             one.
           */
          goto do_positional;
        }
```

`process_arg` and `process_string_arg` are macros that expand to the handler labels for the individual conversions. A specifier that is not recognized lands at `form_unknown` and, unless the format string has ended, jumps to `do_positional`:

```
do_positional:
  if (__glibc_unlikely (workstart != NULL))
    {
      free (workstart);
      workstart = NULL;
    }
  done = printf_positional (s, format, readonly_format, ap, &ap_save,
                            done, nspecs_done, lead_str_end, work_buffer,
                            save_errno, grouping, thousands_sep);
```

So on this path the real work happens in `printf_positional`, which expands the same two macros, this time with a parsed specifier:

```
// nspecs_done => index of the conversion specifier being processed
// specs       => array of parsed struct printf_spec entries
          process_arg ((&specs[nspecs_done]));
          process_string_arg ((&specs[nspecs_done]));
```

There are far too many conversion specifiers to cover, so I only analyze `%n`, the one that matters most for format string attacks. Looking it up in the tables shows that its label is `form_number`:

```
    LABEL (form_number): \
      if (s->_flags2 & _IO_FLAGS2_FORTIFY) \
        { \
          if (! readonly_format) \
            { \
              extern int __readonly_area (const void *, size_t) \
                attribute_hidden; \
              readonly_format \
                = __readonly_area (format, ((STR_LEN (format) + 1) \
                                            * sizeof (CHAR_T))); \
            } \
          if (readonly_format < 0) \
            __libc_fatal ("*** %n in writable segment detected ***\n"); \
        } \
      /* Answer the count of characters written.  */ \
      if (fspec == NULL) \
        { \
          if (is_longlong) \
            *(long long int *) va_arg (ap, void *) = done; \
          else if (is_long_num) \
            *(long int *) va_arg (ap, void *) = done; \
          else if (is_char) \
            *(char *) va_arg (ap, void *) = done; \
          else if (!is_short) \
            *(int *) va_arg (ap, void *) = done; \
          else \
            *(short int *) va_arg (ap, void *) = done; \
        } \
      else \
        if (is_longlong) \
          *(long long int *) args_value[fspec->data_arg].pa_pointer = done; \
        else if (is_long_num) \
          *(long int *) args_value[fspec->data_arg].pa_pointer = done; \
        else if (is_char) \
          *(char *) args_value[fspec->data_arg].pa_pointer = done; \
        else if (!is_short) \
          *(int *) args_value[fspec->data_arg].pa_pointer = done; \
        else \
          *(short int *) args_value[fspec->data_arg].pa_pointer = done; \
      break;
```
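
To make the effect of `form_number` concrete, here is a minimal sketch of what it looks like from the caller's side: `%n`, `%hn` and `%hhn` each store the running count `done` through the next pointer argument, and the length modifier picks the width of the store. The helper name `count_written_demo` and the sample format string are made up for illustration and are not part of glibc. The sketch assumes a libc that supports `%n`, and it uses a string literal as the format, which normally sits in read-only memory and therefore passes the fortify check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of glibc): formats a fixed string into buf
   and records the running output count at three points via %n, %hn and
   %hhn.  Returns the total length reported by snprintf.  */
int
count_written_demo (char *buf, size_t size,
                    int *n_int, short *n_short, signed char *n_char)
{
  /* "abc" is written, then %n stores the count as an int;
     "-42" is written, then %hn stores the count as a short;
     "!" is written, then %hhn stores the count as a char.  */
  int len = snprintf (buf, size, "abc%n-%d%hn!%hhn",
                      n_int, 42, n_short, n_char);

  /* The sketch assumes the caller's buffer is large enough,
     so the output is never truncated.  */
  assert (len >= 0 && (size_t) len < size);
  return len;
}
```

Called with a 64-byte buffer, this leaves `abc-42!` in the buffer and the values 3, 6 and 7 in the three counters, matching the number of characters emitted before each `%n` variant. In an exploit the same stores happen, except that the attacker controls both the pointer and the count.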