Python File Traversal and Pattern-Based Filtering

mac, 2024-12-03

The directory /home/ghost/workspace/Other/ has the following structure:

├── git
├── input
│   ├── csv
│   │   ├── test_file_1.csv
│   │   └── test_file_2.csv
│   ├── test.csv
│   ├── test_file_1.txt
│   └── test_file_2.txt
├── input-archive
└── temp

We now use Python to traverse and filter these files.

Listing the folders at the current level

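    import os

    work_folder = '/home/ghost/workspace/Other'
    # os.listdir returns the names of everything directly under work_folder (files and folders alike)
    print(os.listdir(work_folder))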

Output:

['input', 'temp', 'input-archive', 'git']
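os.listdir returns plain names and does not say which ones are folders. A minimal sketch of one way to tell them apart uses os.scandir. The helper name split_entries and the throwaway demo directory (with a file called notes.txt) are assumptions made for illustration and are not part of the original example.

```python
import os
import tempfile


def split_entries(folder):
    """Return (dirs, files): sorted names of the folders and files directly inside folder."""
    dirs, files = [], []
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_dir():
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    return sorted(dirs), sorted(files)


# Demo on a temporary directory so the sketch runs on any machine.
with tempfile.TemporaryDirectory() as demo_folder:
    os.mkdir(os.path.join(demo_folder, 'input'))
    os.mkdir(os.path.join(demo_folder, 'temp'))
    open(os.path.join(demo_folder, 'notes.txt'), 'w').close()  # hypothetical file
    demo_dirs, demo_files = split_entries(demo_folder)
    print(demo_dirs, demo_files)
```

The names are sorted because neither os.listdir nor os.scandir guarantees any particular order.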

Traversing all files and folders (including subfolders)

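In the code below, root is the folder currently being visited, dirs holds the names of the folders directly under root, and files holds the names of the files directly under root. Because os.walk visits folders top-down by default, removing a name from dirs in place stops the walk from descending into that folder.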

    exclude = ['git', 'temp']  # folders to skip during traversal
    for root, dirs, files in os.walk(work_folder):
        for ex in exclude:
            if ex in dirs:
                dirs.remove(ex)  # drop folders we do not want to descend into
        print(root, dirs, files)

Output:

/home/ghost/workspace/Other ['input', 'input-archive'] []
/home/ghost/workspace/Other/input ['csv'] ['test_file_1.txt', 'test_file_2.txt', 'test.csv']
/home/ghost/workspace/Other/input/csv [] ['test_file_2.csv', 'test_file_1.csv']
/home/ghost/workspace/Other/input-archive [] []
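The same pruning idea can be wrapped in a small generator that yields full file paths. This is only a sketch: the names walk_files and build_demo_tree, and the placeholder files git/config and temp/scratch.txt, are hypothetical and exist only so the example can run against a temporary directory.

```python
import os
import tempfile


def walk_files(top, exclude=()):
    """Yield the full path of every file under top, skipping folders named in exclude."""
    for root, dirs, files in os.walk(top):
        # Slice assignment changes the very list os.walk holds,
        # so the excluded folders are never entered.
        dirs[:] = [d for d in dirs if d not in exclude]
        for name in files:
            yield os.path.join(root, name)


def build_demo_tree(base):
    """Create a small tree resembling the one above (hypothetical helper)."""
    rel_files = (
        'git/config',        # placeholder file, not in the original tree
        'temp/scratch.txt',  # placeholder file, not in the original tree
        'input/csv/test_file_1.csv',
        'input/csv/test_file_2.csv',
        'input/test.csv',
        'input/test_file_1.txt',
    )
    for rel in rel_files:
        path = os.path.join(base, *rel.split('/'))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'w').close()
    os.makedirs(os.path.join(base, 'input-archive'), exist_ok=True)


with tempfile.TemporaryDirectory() as base:
    build_demo_tree(base)
    found = sorted(
        os.path.relpath(p, base).replace(os.sep, '/')
        for p in walk_files(base, exclude=('git', 'temp'))
    )
    print(found)
```

Assigning to dirs[:] rather than rebinding dirs matters here: a plain dirs = [...] would create a new list and leave the one used by os.walk untouched.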

Filtering files by pattern matching

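To find every CSV file under a directory, subdirectories included, use the glob module. With recursive=True, the ** pattern matches zero or more nested directories, so the search descends recursively and also picks up files directly under input.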

    import glob

    pat = '/home/ghost/workspace/Other/input/**/*.csv'
    # csv_path avoids shadowing the standard-library csv module
    for csv_path in glob.glob(pat, recursive=True):
        print(csv_path)

Output:

/home/ghost/workspace/Other/input/test.csv
/home/ghost/workspace/Other/input/csv/test_file_2.csv
/home/ghost/workspace/Other/input/csv/test_file_1.csv
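A glob pattern has no built-in syntax for skipping folders by name. When both exclusion and pattern matching are needed, one possible approach is to combine os.walk pruning with fnmatch.filter. The sketch below assumes a hypothetical helper find_files and a temporary demo tree whose temp/cache.csv file is invented for illustration.

```python
import fnmatch
import os
import tempfile


def find_files(top, pattern, exclude=()):
    """Return sorted paths under top whose file name matches pattern, skipping excluded folders."""
    matches = []
    for root, dirs, files in os.walk(top):
        dirs[:] = [d for d in dirs if d not in exclude]  # prune before descending
        for name in fnmatch.filter(files, pattern):
            matches.append(os.path.join(root, name))
    return sorted(matches)


with tempfile.TemporaryDirectory() as base:
    # Build a tiny tree: two CSV files we want, one inside an excluded folder, one text file.
    for rel in ('input/test.csv', 'input/csv/test_file_1.csv',
                'temp/cache.csv', 'input/test_file_1.txt'):
        path = os.path.join(base, *rel.split('/'))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'w').close()
    csv_files = [
        os.path.relpath(p, base).replace(os.sep, '/')
        for p in find_files(base, '*.csv', exclude=('temp',))
    ]
    print(csv_files)
```

The pattern is applied to file names only, not to full paths, so '*.csv' plays the role that '**/*.csv' plays in the glob example.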