{"id":65280,"date":"2026-03-26T08:35:27","date_gmt":"2026-03-26T01:35:27","guid":{"rendered":"https:\/\/hbbgroup.net\/google-shrinks-ai-memory-with-no-accuracy-loss-but-theres-a-catch\/"},"modified":"2026-03-26T11:00:08","modified_gmt":"2026-03-26T04:00:08","slug":"google-shrinks-ai-memory-with-no-accuracy-loss-but-theres-a-catch","status":"publish","type":"post","link":"https:\/\/hbbgroup.net\/vi\/google-shrinks-ai-memory-with-no-accuracy-loss-but-theres-a-catch\/","title":{"rendered":"Google Shrinks AI Memory With No Accuracy Loss, but There's a Catch"},"content":{"rendered":"<div>\n<p><strong>In brief<\/strong><\/p>\n<ul>\n<li>Google says its TurboQuant algorithm can shrink a major AI memory bottleneck by at least 6x with no accuracy loss during inference.<\/li>\n<li>Shares of memory companies such as Micron, Western Digital, and Seagate fell after the paper circulated.<\/li>\n<li>The method compresses inference memory, not model weights, and has so far been validated only on research benchmarks.<\/li>\n<\/ul>\n<p>Google Research published TurboQuant on Wednesday, a compression algorithm that shrinks a major inference-memory bottleneck by at least 6x while maintaining 
zero accuracy loss.<\/p>\n<p>The paper is slated for presentation at ICLR 2026, and the reaction online was almost immediate.<\/p>\n<p>Cloudflare CEO Matthew Prince called it Google's \u201cDeepSeek moment.\u201d The same day, shares of memory companies including Micron, Western Digital, and Seagate all fell.<\/p>\n<p>So is it real?<\/p>\n<p>The quantization efficiency is a major achievement in its own right. But the \u201czero accuracy loss\u201d claim needs context.<\/p>\n<p>TurboQuant targets the KV cache: the chunk of GPU memory that stores everything a language model needs to \u201cremember\u201d during a conversation.<\/p>\n<p>As context windows grow toward millions of tokens, these caches can balloon to hundreds of gigabytes per session. 
That is the real bottleneck: not compute power, but raw memory capacity.<\/p>\n<p>Traditional compression methods try to shrink these caches by rounding the data down, for example from 32-bit floats to 16-bit, then 8-bit, then 4-bit integers. To picture it, think of downscaling an image from 4K to Full HD to 720p. You can still tell it is the same picture, but 4K clearly holds more detail.<\/p>\n<p>The catch: those methods must store extra \u201cquantization constants\u201d alongside the compressed data to keep the model from degrading too badly. Those constants add 1 to 2 bits per value, partly eroding the compression gains.<\/p>\n<p>TurboQuant claims to eliminate that overhead entirely.<\/p>\n<p>Google does this with two sub-algorithms. 
PolarQuant separates magnitude from direction in vectors, and QJL (Quantized Johnson-Lindenstrauss) takes the tiny residual error left over and reduces it to a single sign bit, positive or negative, with no stored constants at all.<\/p>\n<p>The result, according to Google, is a mathematically unbiased estimator for the attention calculations that drive transformer models.<\/p>\n<p>In benchmarks using Gemma and Mistral, TurboQuant matched full-precision performance even under 4x compression, including perfect retrieval accuracy on \u201cneedle-in-haystack\u201d tests with contexts of up to 104,000 tokens.<\/p>\n<p>Why these benchmarks matter: expanding a model's usable context without degrading quality has long been one of the hardest problems in LLM deployment.<\/p>\n<\/div>\n<div>\n<figure><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/img.decrypt.co\/insecure\/rs:fit:3840:0:0:0\/plain\/https:\/\/cdn.decrypt.co\/wp-content\/uploads\/2026\/03\/Quantization-2.width-1250.png@webp\" alt=\"\" width=\"1250\" height=\"561\" data-nimg=\"1\" \/><\/figure>\n<\/div>\n<p>\u201cZero 
accuracy loss\u201d here applies only to KV cache compression during inference, not to the model's weights. Compressing weights is an entirely different and much harder problem. TurboQuant does not touch them.<\/p>\n<p>What TurboQuant compresses is the temporary memory that holds intermediate attention computations during an inference session. That data is more forgiving of compression because, in theory, it can be reconstructed.<\/p>\n<p>There is also a gap between a clean research benchmark and a production system serving billions of requests. TurboQuant was tested on open-source models (Gemma, Mistral, and Llama), not on Google's own Gemini stack at real-world scale.<\/p>\n<p>Unlike DeepSeek's efficiency gains, which required deep architectural decisions baked in from the start, TurboQuant needs no retraining or fine-tuning and is said to add negligible runtime overhead. 
In theory, it can be dropped straight into existing inference pipelines.<\/p>\n<p>That is what rattled memory-hardware stocks: if the technique works in production, major AI labs can run leaner on the GPUs they already own.<\/p>\n<p>The paper will be presented at ICLR 2026. Until it actually ships in production, the \u201czero accuracy loss\u201d claim remains a lab result.<\/p>","protected":false},"excerpt":{"rendered":"<p>In brief: Google says its TurboQuant algorithm can shrink by at least 6x a 
[&hellip;]<\/p>","protected":false},"author":5,"featured_media":65281,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[220],"tags":[],"class_list":["post-65280","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tien-dien-tu"],"acf":[],"_links":{"self":[{"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/posts\/65280","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/comments?post=65280"}],"version-history":[{"count":1,"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/posts\/65280\/revisions"}],"predecessor-version":[{"id":65714,"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/posts\/65280\/revisions\/65714"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/media\/65281"}],"wp:attachment":[{"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/media?parent=65280"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/categories?post=65280"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hbbgroup.net\/vi\/wp-json\/wp\/v2\/tags?post=65280"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}