按任意数量的子字符串对字符串数组进行排序

如何解决按任意数量的子字符串对字符串数组进行排序

我正在尝试在php-cs-fixer中修改the OrderedImportsFixer class，以便可以按自己的方式清理文件。我想要的是以与在文件系统列表中看到的方式相似的方式订购导入，在“文件”之前列出“目录”。

因此，给定此数组：

$indexes = [
    26 => ["namespace" => "X\\Y\\Zed"],9 =>  ["namespace" => "A\\B\\See"],3 =>  ["namespace" => "A\\B\\Bee"],38 => ["namespace" => "A\\B\\C\\Dee"],51 => ["namespace" => "X\\Wye"],16 => ["namespace" => "A\\Sea"],12 => ["namespace" => "A\\Bees"],31 => ["namespace" => "M"],];

我想要这个输出：

$sorted = [
    38 => ["namespace" => "A\\B\\C\\Dee"],26 => ["namespace" => "X\\Y\\Zed"],];

与典型的文件系统列表中一样：

我已经去uasort了一段时间（必须保持密钥关联），并且已经接近了。诚然，这是由于绝望的结果而不是任何严格的方法。不太了解uasort的工作方式在这里限制了我。

// get the maximum number of namespace components in the list
$ns_counts = array_map(function($val){
    return count(explode("\\",$val["namespace"]));
},$indexes);
$limit = max($ns_counts);

for ($depth = 0; $depth <= $limit; $depth++) {
    uasort($indexes,function($first,$second) use ($depth,$limit) {
        $fexp = explode("\\",$first["namespace"]);
        $sexp = explode("\\",$second["namespace"]);
        if ($depth === $limit) {
            // why does this help?
            array_pop($fexp);
            array_pop($sexp);
        }
        $fexp = array_slice($fexp,$depth + 1,true);
        $sexp = array_slice($sexp,true);
        $fimp = implode(" ",$fexp);
        $simp = implode(" ",$sexp);
        //echo "$depth: $fimp <-> $simp\n";
        return strnatcmp($fimp,$simp);
    });
}
echo json_encode($indexes,JSON_PRETTY_PRINT);

这给了我正确排序的输出，但是在底部而不是顶部有更深的名称空间：

{
    "31": {
        "namespace": "M"
    },"12": {
        "namespace": "A\\Bees"
    },"16": {
        "namespace": "A\\Sea"
    },"3": {
        "namespace": "A\\B\\Bee"
    },"9": {
        "namespace": "A\\B\\See"
    },"38": {
        "namespace": "A\\B\\C\\Dee"
    },"51": {
        "namespace": "X\\Wye"
    },"26": {
        "namespace": "X\\Y\\Zed"
    }
}

我在想我可能必须为每个级别的命名空间构建一个单独的数组并对其进行单独排序，但是在如何做到这一点上有一个空白。有什么建议可以完成这项工作的最后一步，或者有什么完全不涉及很多循环的东西？

解决方法

我相信以下方法应该有效：

uasort($indexes,static function (array $entry1,array $entry2): int {  
    $ns1Parts = explode('\\',$entry1['namespace']);
    $ns2Parts = explode('\\',$entry2['namespace']);

    $ns1Length = count($ns1Parts);
    $ns2Length = count($ns2Parts);

    for ($i = 0; $i < $ns1Length && isset($ns2Parts[$i]); $i++) {
        $isLastPartForNs1 = $i === $ns1Length - 1;
        $isLastPartForNs2 = $i === $ns2Length - 1;

        if ($isLastPartForNs1 !== $isLastPartForNs2) {
            return $isLastPartForNs1 <=> $isLastPartForNs2;
        }

        $nsComparison = $ns1Parts[$i] <=> $ns2Parts[$i];

        if ($nsComparison !== 0) {
            return $nsComparison;
        }
    }

    return 0;
});

它的作用是：

将名称空间拆分为多个部分
比较从第一个部分开始的每个部分，然后：
- 如果我们只讨论最后一部分而不是另一部分，则优先考虑具有最多部分的那一部分，
- 否则，如果各自的部分不同，则按字母顺序优先于另一部分。

Demo

我们将其分为4个步骤。

步骤1：从数据集中创建层次结构。

<source>: In function 'int main()':

<source>:18:73: error: no matching function for call to 'prod(const matrix_type&,const matrix_type&)'

          boost::numeric::ublas::prod(rotate.matrix(),translate.matrix())

                                                                         ^

In file included from <source>:5:0:

/celibs/boost_1_65_0/boost/numeric/ublas/matrix_expression.hpp:4281:5: note: candidate: template<class E1,class E2> typename   
boost::numeric::ublas::matrix_vector_binary1_traits<typename E1::value_type,E1,typename E2::value_type,E2>::result_type 
boost::numeric::ublas::prod(const boost::numeric::ublas::matrix_expression<E>&,const boost::numeric::ublas::vector_expression<E2>&,boost::numeric::ublas::unknown_storage_tag,boost::numeric::ublas::row_major_tag)

     prod (const matrix_expression<E1> &e1,^~~~

用function createHierarchicalStructure($indexes){ $data = []; foreach($indexes as $d){ $temp = &$data; foreach(explode("\\",$d['namespace']) as $namespace){ if(!isset($temp[$namespace])){ $temp[$namespace] = []; } $temp = &$temp[$namespace]; } } return $data; }分隔名称空间，并维护一个\\变量。使用$data地址引用来继续编辑数组的相同副本。

步骤2：对第一个文件夹中的层次结构进行排序，然后对文件进行排序。

对下级文件夹进行排序，并将function fileSystemSorting(&$indexes){ foreach($indexes as $key => &$value){ fileSystemSorting($value); } uksort($indexes,function($key1,$key2) use ($indexes){ if(count($indexes[$key1]) == 0 && count($indexes[$key2]) > 0) return 1; if(count($indexes[$key2]) == 0 && count($indexes[$key1]) > 0) return -1; return strnatcmp($key1,$key2); }); }用于当前级别的文件夹。反之亦然。如果比较的两个文件夹都具有子文件夹，则将它们作为字符串进行比较，否则，如果一个是文件夹而另一个是文件，则使文件夹位于上方。

第3步：现在，按顺序整理层次结构。

uksort

步骤4：将原始数据密钥还原回去并返回结果。

function flattenFileSystemResults($hierarchical_data){
    $result = [];
    foreach($hierarchical_data as $key => $value){
        if(count($value) > 0){
            $sub_result = flattenFileSystemResults($value);
            foreach($sub_result as $r){
                $result[] = $key . "\\" . $r;
            }   
        }else{
            $result[] = $key;
        }
    }
    
    return $result;
}

驱动程序代码：

function associateKeys($data,$indexes){
    $map = array_combine(array_column($indexes,'namespace'),array_keys($indexes));
    $result = [];
    foreach($data as $val){
        $result[ $map[$val] ] = ['namespace' => $val];
    }
    return $result;
}

演示：

这是另一个将步骤进一步细分的版本，尽管它可能不是最理想的，但绝对可以帮助我的大脑进行思考。有关正在发生的事情的更多详细信息，请参见评论：

uasort(
    $indexes,static function (array $a,array $b) {

        $aPath = $a['namespace'];
        $bPath = $b['namespace'];

        // Just in case there are duplicates
        if ($aPath === $bPath) {
            return 0;
        }

        // Break into parts
        $aParts = explode('\\',$aPath);
        $bParts = explode('\\',$bPath);

        // If we only have a single thing then it is a root-level,just compare the item
        if (1 === count($aParts) && 1 === count($bParts)) {
            return $aPath <=> $bPath;
        }

        // Get the class and namespace (file and folder) parts
        $aClass = array_pop($aParts);
        $bClass = array_pop($bParts);

        $aNamespace = implode('\\',$aParts);
        $bNamespace = implode('\\',$bParts);

        // If the namespaces are the same,sort by class name
        if ($aNamespace === $bNamespace) {
            return $aClass <=> $bClass;
        }

        // If the first namespace _starts_ with the second namespace,sort it first
        if (0 === mb_strpos($aNamespace,$bNamespace)) {
            return -1;
        }

        // Same as above but the other way
        if (0 === mb_strpos($bNamespace,$aNamespace)) {
            return 1;
        }

        // Just only by namespace
        return $aNamespace <=> $bNamespace;
    }
);

Online demo

我发现Jeto的算法设计没有错，但是我决定更简洁地实现它。我的代码段避免了for()循环中的迭代函数调用和算术运算，而是使用单个太空飞船运算符和单个返回。我的代码段短了50％以上，而且我通常觉得它更容易阅读，但后来所有人都认为自己的宝宝很可爱，对吧？

代码：（Demo）

uasort($indexes,function($a,$b) {  
    $aParts = explode('\\',$a['namespace']);
    $bParts = explode('\\',$b['namespace']);
    $aLast = count($aParts) - 1;
    $bLast = count($bParts) - 1;

    for ($cmp = 0,$i = 0; $i <= $aLast && !$cmp; ++$i) {
        $cmp = [$i === $aLast,$aParts[$i]] <=> [$i === $bLast,$bParts[$i]];
    }
    return $cmp;
});

就像Jeto的答案一样，它同时迭代每个数组，并且

检查任一元素是否为数组的最后一个元素，如果是，则将其降到列表的下方（因为我们希望更长的路径来赢得决胜局）；
如果两个元素都不是其数组的最后一个，则按字母顺序比较元素的当前字符串。
重复此过程，直到生成非零评估。
由于预计不会出现重复的条目，因此返回值应始终为-1或1（决不要为0）

请注意，我有两个条件要暂停for()循环。

如果$i大于$aParts中的元素数（仅）-因为如果$bParts的元素数少于$aParts，则$cmp会在触发通知之前生成一个非零值。
如果$cmp是一个非零值。

最后，要解释太空飞船运算符两侧的数组语法...
太空飞船运算符将从左到右比较数组，因此其行为类似于：

leftside[0] <=> rightside[0] then leftside[1] <=> rightside[1]

以这种方式进行多次比较不会影响性能，因为<=>的任何一侧都没有函数调用。如果涉及到函数调用，则以回退方式进行单个比较的性能会更高：

fun($a1) <=> fun($b1) ?: fun($a2) <=> fun($b2)

这样，只有在需要平局的情况下，才进行后续函数调用。

按任意数量的子字符串对字符串数组进行排序

如何解决按任意数量的子字符串对字符串数组进行排序

解决方法

相关推荐