-
Notifications
You must be signed in to change notification settings - Fork 11.7k
Replace mb_split with preg_split in Str helper #58015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
We shouldn't build our own solution when there's an extension explicitly taking care of multi-byte strings :/ |
Good point! But if we can achieve the same result using core functions instead of php extensions wouldn't it be better in the long run? |
|
PHP doesn't have built-in multi-byte support, so no. We have to rely on an extension. And since Laravel wants |
|
Thanks for the feedback! I understand the concern about relying on extensions, but I'd like to clarify a few technical points regarding "built-in" support and consistency:
Given that the framework already trusts Would you be open to reconsidering this to make the framework more robust across different PHP builds? |
|
Thanks for your pull request to Laravel! Unfortunately, I'm going to delay merging this code for now. To preserve our ability to adequately maintain the framework, we need to be very careful regarding the amount of code we include. If applicable, please consider releasing your code as a package so that the community can still take advantage of your contributions! |
I never said that it isn't core, I said, it is not primarily(!) meant for multi-byte string handling and therefore should not expected to be the most bullet-proof solution. And even if it has fairly good multi-byte support, it terms of consistency (I know, point 3 on your list, but you only refer to one multi-byte use case, I rather mean the entirety of them), we should try to depend on one "tool" (multi-byte-first support extensions) rather than a patchwork rug of a multitude of different solutions for every different use case.
Good point. Since Taylor closed this PR, maybe we should add this dependency to make sure it is installed. There might be cases where there is no
The framework is not perfect. People constantly argue that because someone added something ten years back (I'm exaggerating here) and nobody noticed it to change it, it is the way to go for all eternity. What people have done, however, is gradually improving multi-byte support in the helper. But there's still room for even more improvement. That this takes time is the reality of open-source community-driven software. |
Description
This PR replaces usages of
mb_splitwithpreg_splitin theIlluminate\Support\Strclass.Motivation
The
mb_splitfunction is part of thembstringextension's regex module (mbregex). Whilembstringis a requirement for Laravel, it is possible to compilembstringwithoutmbregexsupport (using--disable-mbregex). In such environments, or in certain PHP runtime configurations (e.g., specific MCP server environments),mb_splitmay be undefined even ifmbstringis loaded.preg_splitis part of the PCRE extension, which is a core requirement of PHP and is always available. Usingpreg_splitwith theu(Unicode) modifier achieves the same result asmb_splitfor splitting strings by whitespace, making the framework more robust across different PHP builds.Changes
mb_split('\s+', $value)withpreg_split('/\s+/u', $value)in:Str::headline()Str::apa()Str::studly()Verification
I have verified that
preg_split('/\s+/u', ...)correctly splits strings containing Unicode characters by whitespace, identical tomb_split('\s+', ...).