How do I extract a NumPy array from a CSV file?

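For an assignment, I have to use NumPy to extract data from a CSV file. The file contains many rows; the first row holds the column labels and looks like label,pixel1,pixel2,pixel3,...,pixel785, and it should be ignored. Each following row has a label in its first cell (some integer between 1 and 10, I believe), and the next 784 cells contain the actual pixel values. These numbers have to be reshaped into a 28x28 array.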

The function should return two np.array objects, one with the labels and one with the images, and the output should look like this:

(27455,28,28)
(27455,)
(7172,28,28)
(7172,)

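This is what I have so far. I managed to get the pixel values into a 28x28 array (I think), but I'm not sure where to go from there. The assignment suggests using np.array().astype, because I need to convert the values to floats.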

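I have never worked with arrays in NumPy, so I'm not sure how to use them. Did I do the first part correctly? How do I return the images and the labels?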

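(I'm trying to understand all the concepts and hints, so please stay within the scope of the assignment when answering. I'm already struggling with this, so I'm not looking for other possible solutions!)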

def get_data(filename):
  # You will need to write code that will read the file passed
  # into this function. The first line contains the column headers
  # so you should ignore it
  # Each successive line contains 785 comma separated values between 0 and 255
  # The first value is the label
  # The rest are the pixel values for that picture
  # The function will return 2 np.array types. One with all the labels
  # One with all the images
    #
  # Tips: 
  # If you read a full line (as 'row') then row[0] has the label
  # and row[1:785] has the 784 pixel values
  # Take a look at np.array_split to turn the 784 pixels into 28x28
  # You are reading in strings,but need the values to be floats
  # Check out np.array().astype for a conversion
    with open(filename,'r') as training_file:
      # Your code starts here
        #training_file.readline()
        csv_reader = csv.reader(training_file)
        header=next(csv_reader)

        if header != None:

            for row in csv_reader:
                images=np.array_split(row[1:],28)
      # Your code ends here
    return images,labels

path_sign_mnist_train = f"{getcwd()}/../tmp2/sign_mnist_train.csv"
path_sign_mnist_test = f"{getcwd()}/../tmp2/sign_mnist_test.csv"
training_images,training_labels = get_data(path_sign_mnist_train)
testing_images,testing_labels = get_data(path_sign_mnist_test)

# Keep these
print(training_images.shape)
print(training_labels.shape)
print(testing_images.shape)
print(testing_labels.shape)

# Their output should be:
# (27455,28,28)
# (27455,)
# (7172,28,28)
# (7172,)

Solution

I think this should solve it:

import csv
import numpy as np

def get_data(filename):
    # You will need to write code that will read the file passed
    # into this function. The first line contains the column headers
    # so you should ignore it
    # Each successive line contains 785 comma separated values between 0 and 255
    # The first value is the label
    # The rest are the pixel values for that picture
    # The function will return 2 np.array types. One with all the labels
    # One with all the images
    #
    # Tips:
    # If you read a full line (as 'row') then row[0] has the label
    # and row[1:785] has the 784 pixel values
    # Take a look at np.array_split to turn the 784 pixels into 28x28
    # You are reading in strings,but need the values to be floats
    # Check out np.array().astype for a conversion
    with open(filename,"r") as training_file:
        # Your code starts here
        # training_file.readline()
        csv_reader = csv.reader(training_file)
        next(csv_reader,None)  # skip the header row; the None default avoids StopIteration on an empty file
        images = []
        labels = []
        for row in csv_reader:
            images.append(np.array(row[1:]).reshape(28,28))
            labels.append(row[0])

    images = np.array(images).astype(np.float32)
    labels = np.array(labels).astype(np.float32)
    # Your code ends here
    return images,labels
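
As a quick sanity check, the same loop can be tried on a tiny in-memory file. This is only a sketch under stated assumptions: the two-row CSV built with io.StringIO is a hypothetical stand-in for sign_mnist_train.csv, and the label and pixel values in it are made up. It also shows what the np.array_split hint produces (a list of 28 arrays of length 28), and why the original attempt fell short: it reassigned images on every row instead of appending, and never collected labels.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for the real CSV: a header line, then
# one label followed by 784 pixel values per row (made-up data).
header = "label," + ",".join(f"pixel{i}" for i in range(1, 785))
row_a = "3," + ",".join(["0"] * 784)
row_b = "7," + ",".join(["255"] * 784)
fake_file = io.StringIO("\n".join([header, row_a, row_b]) + "\n")

csv_reader = csv.reader(fake_file)
next(csv_reader, None)  # skip the header row

images = []
labels = []
for row in csv_reader:
    # row is a list of 785 strings; row[1:] holds the 784 pixel strings
    images.append(np.array(row[1:]).reshape(28, 28))
    labels.append(row[0])

# The strings are converted to floats only here, via astype
images = np.array(images).astype(np.float32)
labels = np.array(labels).astype(np.float32)

print(images.shape)  # (2, 28, 28)
print(labels.shape)  # (2,)
print(labels)        # [3. 7.]

# The np.array_split hint: 784 values become a list of 28 arrays,
# each of length 28; wrapping the list in np.array gives (28, 28).
split_rows = np.array_split(np.arange(784), 28)
print(len(split_rows), split_rows[0].shape)  # 28 (28,)
print(np.array(split_rows).shape)            # (28, 28)
```

With the real files, the same steps should give the shapes the assignment expects, since every row contributes one 28x28 image and one label.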
