Notebook

This is a notebook for recording things I have learned through day-to-day work and things I have noticed in everyday life.

HTML file generated: 2024/11/21 17:40:55.112 (Taiwan Standard Time)

How to read a gzip-compressed file in Python (mid-July 2023)

Here is how to read a gzip-compressed file in Python. As an example, consider reading a file named MPCORB.DAT.gz. It seems to be enough to <import gzip>, call <gzip.open (filename, 'r')>, and then read the file contents through the file handle that gzip.open returns.


#!/usr/pkg/bin/python3.10

#
# Time-stamp: <2023/07/17 13:31:53 (CST) daisuke>
#

# importing gzip module
import gzip

# file to read
file_gzipped = 'MPCORB.DAT.gz'

# opening file
with gzip.open (file_gzipped, 'r') as fh_gz:
    # reading file
    data = fh_gz.readlines ()
    # printing first 20 lines
    for line_rawbytes in data[:20]:
        # conversion from raw bytes into UTF-8 string
        line_utf8 = line_rawbytes.decode ()
        # printing a line
        print (f'{line_utf8}', end='')

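The point that needed care is that, with gzip.open in mode 'r' (binary mode, the same as 'rb'), the contents read from the file are raw bytes. Each line has to be converted into a string with .decode (), which uses UTF-8 by default.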
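As a side note, the manual .decode () step can be avoided by opening the file in text mode. The following is only a minimal sketch of that alternative, not the script used above: gzip.open accepts mode 'rt' together with an encoding argument and then yields str lines directly. The helper name read_first_lines and its parameters n and encoding are hypothetical names chosen for illustration, and the sketch assumes the file is UTF-8 text.

```python
import gzip
from itertools import islice


def read_first_lines(path, n=20, encoding='utf-8'):
    """Return the first n lines of a gzip-compressed text file as str.

    Hypothetical helper for illustration; assumes the file is text
    encoded with the given encoding.
    """
    # mode 'rt' makes gzip.open decode the bytes into str while reading
    with gzip.open(path, 'rt', encoding=encoding) as fh:
        # islice stops after n lines, so the rest of the file is not read
        return list(islice(fh, n))
```

Calling read_first_lines ('MPCORB.DAT.gz') and printing each returned line with end='' should give the same kind of listing as the script above. Because the lines are taken lazily with islice, the whole file does not have to be loaded into memory the way readlines () does, which may matter for a large file.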

The output is as follows.


% ./read_gz_file.py
MINOR PLANET CENTER ORBIT DATABASE (MPCORB)

This file contains published orbital elements for all numbered and unnumbered
multi-opposition minor planets for which it is possible to make reasonable
predictions.  It also includes published elements for recent one-opposition
minor planets and is intended to be complete through the last issued Daily
Orbit Update MPEC.  As such it is intended to be of interest primarily
to astrometric observers.

   Software programs may include this datafile amongst their datasets, as
   long as this header is included (it is acceptable if it is contained
   in a file separate from the actual data) and that proper attribution
   to the Minor Planet Center is given.  Credit to the individual orbit
   computers is implicit by the inclusion of a reference and the name of
   the orbit computer on each orbit record.  Information on how to obtain
   updated copies of the datafile must also be included.

   The work of the individual astrometric observers, without whom none of
   the work of the Minor Planet Center would be possible, is gratefully
   acknowledged.  Credit to the individual observers is implicit by the

fig_202307/xterm_gzip_00.png



HTML file generated by Kinoshita Daisuke.