Ever needed to get the name of a class from a file without actually loading that file in PHP? Probably not – I’ll admit it’s kind of a super edgy edge case. Nonetheless, it’s an interesting problem, and I ran into it working on some migration scripts for a project. So, for the sake of my own memory, I thought I would put together this quick little write-up. Hopefully it helps someone else out as well.
The Problem
So while writing a migration script I found that I needed to get the names of all classes contained in all the files within my migrations directory. I wanted just the list of class names so I could check which migrations had been run and which remained to be run, and then only run the ones that were needed.
The Not-a-Good-Enough-Solution Solution
At first I thought of using get_declared_classes() because it was an obvious way to get a list of class names. Unfortunately, it requires including or requiring all the files and then checking which classes are declared. This approach is fraught with issues in my mind, though, not least of which is the fact that every class would, well, get declared, and I do not want that overhead (minuscule as it would likely be). Just in case it does work for you, and you do not want to read any further, here is my untested thought on how you would go about using the list of declared classes:
<?php

//Store the list of classes that are currently loaded
$classes = get_declared_classes();

//Include the file we want to get the class name for
include 'path/to/file.php';

//Use array_diff() to find what class(es) exist now that did not before
$class = array_diff(get_declared_classes(), $classes);
The Better, More Extendable Solution
So, putting aside the get_declared_classes() option, we need another way to get the class names, preferably without actually declaring the classes. To do this we can use PHP’s tokenizer to convert the contents of the file into its basic language tokens. Once we have the tokens, we can iterate through them and pull out the information that is important to us. In this case, the only things we care about are the textual parts of the namespace declaration and the textual parts of the class declaration. Here’s a quick script that should do the job:
<?php

function get_class_from_file($path_to_file)
{
    //Grab the contents of the file
    $contents = file_get_contents($path_to_file);

    //Start with a blank namespace and class
    $namespace = $class = "";

    //Set helper values to know that we have found the namespace/class token and need to collect the string values after them
    $getting_namespace = $getting_class = false;

    //Tokens that can make up a namespace name (PHP 8+ emits a qualified name such as Foo\Bar as one T_NAME_QUALIFIED token)
    $name_tokens = [T_STRING, T_NS_SEPARATOR];
    if (defined('T_NAME_QUALIFIED')) {
        $name_tokens[] = T_NAME_QUALIFIED;
    }

    //Go through each token and evaluate it as necessary
    foreach (token_get_all($contents) as $token) {

        //If this token is the namespace declaring, then flag that the next tokens will be the namespace name
        if (is_array($token) && $token[0] == T_NAMESPACE) {
            $getting_namespace = true;
        }

        //If this token is the class declaring, then flag that the next tokens will be the class name
        if (is_array($token) && $token[0] == T_CLASS) {
            $getting_class = true;
        }

        //While we're grabbing the namespace name...
        if ($getting_namespace === true) {

            //If the token is part of the namespace name...
            if (is_array($token) && in_array($token[0], $name_tokens)) {

                //Append the token's value to the name of the namespace
                $namespace .= $token[1];

            } else if ($token === ';') {

                //If the token is the semicolon, then we're done with the namespace declaration
                $getting_namespace = false;
            }
        }

        //While we're grabbing the class name...
        if ($getting_class === true) {

            //If the token is a string, it's the name of the class
            if (is_array($token) && $token[0] == T_STRING) {

                //Store the token's value as the class name
                $class = $token[1];

                //Got what we need, stop here
                break;
            }
        }
    }

    //Build the fully-qualified class name and return it
    return $namespace ? $namespace . '\\' . $class : $class;
}
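To get a feel for what the loop in that script is iterating over, a small sketch like the following dumps the tokens for a tiny source string. The sample source and the $names variable are made up for illustration. token_get_all() returns each token either as an array (token id, text, line number) or, for single characters such as ; and {, as a plain string, which is why the script checks is_array() before looking at $token[0].

```php
<?php

// Made-up sample source; any PHP code string would do here
$source = "<?php\nnamespace App;\nclass CreateUsersTable {}\n";

$names = [];
foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
        // Array tokens look like [token id, text, line number]
        if ($token[0] === T_WHITESPACE) {
            continue; // whitespace is noise for this demo
        }
        $names[] = token_name($token[0]) . ' ' . json_encode($token[1]);
    } else {
        // Single-character tokens such as ";" or "{" arrive as plain strings
        $names[] = 'char ' . json_encode($token);
    }
}

echo implode("\n", $names), "\n";
```

The T_NAMESPACE and T_CLASS entries in that dump are the two markers the script keys on, and the T_STRING entries right after them carry the names.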
Why The Tokenizer Is Better
Better is a bit of a loaded term, and some may argue that this is more work than the get_declared_classes() option. My reasoning for calling this better rests on the fact that the script above is far more extendable than the get_declared_classes() option. In the future, if I find I want the list of methods of the class, or want to get some docblock annotations from the file, I already have the means for doing this and just need to make the required updates. So, merely for the simple fact that the above script provides a skeleton upon which the yet-unknown future needs of my migration script may be draped, I will for now consider this the better option.
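As a rough idea of what such an extension might look like, here is a minimal sketch that collects function names using the same flag-and-collect pattern. The helper name get_function_names_from_source() and the sample source are hypothetical, and it works on a source string rather than a file path to keep the demo self-contained. It assumes a file with a single class, so every named function it finds is a method of that class.

```php
<?php

// Hypothetical helper for illustration; not part of the original script
function get_function_names_from_source($source)
{
    $names = [];
    $getting_function = false;

    foreach (token_get_all($source) as $token) {
        // A `function` keyword means a name may follow
        if (is_array($token) && $token[0] == T_FUNCTION) {
            $getting_function = true;
            continue;
        }

        if ($getting_function) {
            // Skip whitespace between `function` and the name
            if (is_array($token) && $token[0] == T_WHITESPACE) {
                continue;
            }

            // Skip the `&` of a by-reference return (a plain character or an array token, depending on the PHP version)
            $text = is_array($token) ? $token[1] : $token;
            if ($text === '&') {
                continue;
            }

            // A T_STRING here is the function name; anything else (such as "(") means a closure
            if (is_array($token) && $token[0] == T_STRING) {
                $names[] = $token[1];
            }
            $getting_function = false;
        }
    }

    return $names;
}

// Made-up migration source used only for this demo
$source = <<<'PHP'
<?php
class CreateUsersTable
{
    public function up() {}
    public function down() { $noop = function () {}; }
}
PHP;

print_r(get_function_names_from_source($source));
```

Docblocks could be picked up the same way by watching for T_DOC_COMMENT tokens, though that is left out of this sketch.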
Other Considerations
As with all development solutions, there are extra caveats:
- The script above defines a global function, which is not very object-oriented. This is just for illustration purposes, and it can easily be made into a method of a class (which is actually how I have it implemented in my migration class).
- There is not a lot…or any…error checking happening in the script above. Consider adding in error catching before you make this a part of a system that might live in a production environment.
- The script above expects that it will find only a single class in the file, so it stops after the first one. If you’re being good and following PSR-4 then this should be the case. If for some reason there is more than one class possible in the file, then this script would require some changes to handle that scenario.
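For anyone who needs the second and third caveats handled, the following is one possible sketch, not the version from my migration class. The names get_classes_from_source() and get_classes_from_file() are hypothetical. It collects every class in the file instead of stopping at the first, throws if the file cannot be read, resets the class flag after one significant token, and ignores the class keyword when it follows :: (as in static::class), so a file without a real class declaration yields an empty list.

```php
<?php

// Hypothetical variant for illustration: returns every class name found in a source string
function get_classes_from_source($source)
{
    // Tokens that can make up a namespace name; PHP 8+ emits Foo\Bar as one T_NAME_QUALIFIED token
    $name_tokens = [T_STRING, T_NS_SEPARATOR];
    if (defined('T_NAME_QUALIFIED')) {
        $name_tokens[] = T_NAME_QUALIFIED;
    }

    $classes = [];
    $namespace = '';
    $getting_namespace = $getting_class = false;
    $previous = null; // id of the last significant token seen

    foreach (token_get_all($source) as $token) {
        // Normalize: array tokens give an integer id, single characters stay strings
        $id = is_array($token) ? $token[0] : $token;

        // Whitespace and comments never carry a name
        if ($id === T_WHITESPACE || $id === T_COMMENT || $id === T_DOC_COMMENT) {
            continue;
        }

        if ($getting_namespace) {
            if (in_array($id, $name_tokens, true)) {
                $namespace .= $token[1];
            } else {
                // ";" or "{" ends the namespace name
                $getting_namespace = false;
            }
        } elseif ($getting_class) {
            // Only the token right after `class` can be the name; anonymous classes have none
            if ($id === T_STRING) {
                $classes[] = ltrim($namespace . '\\' . $token[1], '\\');
            }
            $getting_class = false;
        }

        if ($id === T_NAMESPACE) {
            $namespace = '';
            $getting_namespace = true;
        } elseif ($id === T_CLASS && $previous !== T_DOUBLE_COLON) {
            // Skip the `class` in Foo::class or static::class
            $getting_class = true;
        }

        $previous = $id;
    }

    return $classes;
}

// Hypothetical file wrapper with basic error checking
function get_classes_from_file($path_to_file)
{
    if (!is_file($path_to_file) || !is_readable($path_to_file)) {
        throw new InvalidArgumentException("Cannot read file: $path_to_file");
    }

    $contents = file_get_contents($path_to_file);
    if ($contents === false) {
        throw new RuntimeException("Failed to load file: $path_to_file");
    }

    return get_classes_from_source($contents);
}

// Made-up source used only for this demo
$source = <<<'PHP'
<?php
namespace App\Migrations;

class First
{
    public function name() { return static::class; }
}

class Second {}
PHP;

print_r(get_classes_from_source($source));
```

This is still a sketch: it does not try to handle every construct the language allows, such as the namespace keyword used as an operator in older PHP versions.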
Szabolcs Palmer
Hi Jarret,
I know that this is an old post, just wanted to flag that in the get_class_from_file code block you have a typo in line 61 – the second question mark should be a colon.
Cheers,
Szasza
jbyrne
Good catch, thanks!
Alejandro
Hey, thanks for the inspiration, I needed something like this.
I loved the Tokenized version, as the first approach wouldn’t work if the Class was previously loaded by some other mechanism.
However, I ran into one problem: files that were not true classes, say a trait, but happened to have code like $this->someMethod(static::class, '_some_value_') were mistakenly assuming that the file’s class was NAMESPACE/_some_value_.
So, I came with a different approach to share with you, using RegEx:
private function _getClassFromFile($pathToFile)
{
    $namespaceRegex = '/namespace[ ]+([\w\d_\\\\]+);/';
    $namespaceMatchList = [];
    $namespaceIdx = 1;

    $classNameRegex = '/(abstract +)?class[ ]+([\w\d_]+)[ ]*(extends|implements|{)?/';
    $classNameMatchList = [];
    $classAbstractIdx = 1;
    $classNameIdx = 2;

    $contents = file_get_contents($pathToFile);
    $isNamespace = preg_match($namespaceRegex, $contents, $namespaceMatchList) == 1;
    $isClassName = preg_match($classNameRegex, $contents, $classNameMatchList) == 1;

    // If a class name was not found, this file does not represent a Class file, or the class
    // is an abstract Class and thus cannot be instantiated
    if (!$isClassName || !empty($classNameMatchList[$classAbstractIdx])) {
        return false;
    }

    $namespace = $isNamespace ? $namespaceMatchList[$namespaceIdx] : '';
    $className = $classNameMatchList[$classNameIdx];
    $qualifiedName = ltrim($namespace, '\\') . '\\' . $className;

    return $qualifiedName;
}
jbyrne
Yeah, sounds like your needs were a little more complex than mine – I knew the general structure of every file I was inspecting and could toss out the need to evaluate most cases, like traits and abstracts.
Regex is a powerful tool, and I use it often, but generally as a last resort. In this case I think it’s highly successful because we’re talking about keywords in a language that have a very specific structure. Chances are good that before a regex failed in this instance, the file would have to contain failing, error-throwing PHP code.
Thanks for providing your alternative implementation.
David Egan
This came in very handy today – just wanted to say thanks!
I’ve not used parser tokens before so this post was quite eye-opening.
jbyrne
Hey there David, thanks for the note. I’m glad this post proved valuable for you.
Carter Pape
This is great. Just copied and pasted to my own class and reformatting to match my code style. Let me know if you have licensing or credit requirements or requests.