[ukai] A crow may be drowned by aping a cormorant
Fumitoshi UKAI's hacking life

Sun, 04 Sep 2005

fsck for old snapshot.debian.net is completed, but...

I noticed today that the long fsck on the disks for the old snapshot.debian.net has completed. It took 75 days! However, it didn't end successfully: some filesystem errors seem to remain. After force-mounting the filesystem, I couldn't find any files except lost+found, and listing the lost+found directory never finished. Not only does the filesystem still have errors, but too many files were moved into lost+found. I'm afraid it's impossible to restore the old archive...

[15:00]

Fri, 22 Jul 2005

1 month fsck...

The fsck has now been running for a month....

root      6235 36.1 59.7 1080080 307808 pts/2 D+  Jun21 15911:50 fsck.ext3 /dev/md5

How much longer will it take to finish?
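As a rough reading of the ps line above, here is a minimal Python sketch. It assumes the usual `ps aux` conventions: the TIME column is accumulated CPU time in minutes:seconds, and %CPU is CPU time divided by elapsed time. The variable names are mine.

```python
# Values copied from the ps line above; the column interpretation is an assumption.
cpu_time = "15911:50"   # TIME column, read as minutes:seconds of CPU time
cpu_percent = 36.1      # %CPU column

minutes, seconds = (int(part) for part in cpu_time.split(":"))
cpu_minutes = minutes + seconds / 60

# CPU time actually consumed by fsck.ext3, expressed in days.
cpu_days = cpu_minutes / (24 * 60)

# If %CPU = CPU time / elapsed time, inverting it gives the elapsed time.
elapsed_days = cpu_minutes / (cpu_percent / 100) / (24 * 60)

print(f"CPU time: {cpu_days:.1f} days")
print(f"implied elapsed time: {elapsed_days:.1f} days")
```

Under those assumptions, the process has used about 11 days of CPU over roughly a month of wall-clock time, which fits a start date of Jun21. The `D+` state suggests most of the remaining time is spent waiting on disk I/O.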

[00:00]

Thu, 21 Jul 2005

BTS version tracking from snapshot.debian.net

snapshot.debian.net now has BTS version tracking on its package pages.

For example, you can find a link to the BTS for each version of the ssh package at http://snapshot.debian.net/package/ssh .

[00:20]

Thu, 23 Jun 2005

debdelta consideration for snapshot.debian.net

I'm considering debdelta, which would manage efficient deltas between deb files, so that snapshot.debian.net can use its disk space more efficiently.

I wrote a short shell script using rdiff or xdelta and tested how efficient a delta it can generate. It also regenerates exactly the same deb file from the base deb file and the delta file.

For example:

 22473302 tetex-base_3.0-3_all.deb
 22473170 tetex-base_3.0-2_all.deb
delta between these 2 debs
   231706 t.rdiff
   184444 t.xdelta

  9055368 evolution_2.2.2-4_i386.deb
  9055634 evolution_2.2.2-3_i386.deb
delta between these 2 debs
   575034 e.rdiff
   341246 e.xdelta

 43838036 openoffice.org-bin_1.1.3-8_i386.deb
 43832154 openoffice.org-bin_1.1.3-9_i386.deb
delta between these 2 debs
  9984718 ooo.rdiff
  6728138 ooo.xdelta

 43879974 openoffice.org-bin_1.1.4-1_i386.deb
 43789874 openoffice.org-bin_1.1.4-2_i386.deb
delta between these 2 debs
 13645574 ooo4.rdiff
  8307308 ooo4.xdelta

Hmm, in most cases xdelta is better than rdiff. Anyway, I should rewrite the script to be better, and especially more robust. The current major issue is that it is too slow (both taking a delta and applying one), so it would not be suitable for all the deb files in snapshot.debian.net's pool.
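To put the byte counts above in proportion, here is a small Python sketch that expresses each delta as a percentage of the first deb listed in its pair. The sizes are copied from the listing; the labels and the choice of reference file are my own (the two debs in each pair differ in size by well under 1%, so the choice barely matters).

```python
# (size of first-listed deb, rdiff delta size, xdelta delta size), all in bytes,
# copied from the listing above. The dictionary keys are just labels.
samples = {
    "tetex-base 3.0-2/3.0-3": (22473302, 231706, 184444),
    "evolution 2.2.2-3/2.2.2-4": (9055368, 575034, 341246),
    "openoffice.org-bin 1.1.3-8/1.1.3-9": (43838036, 9984718, 6728138),
    "openoffice.org-bin 1.1.4-1/1.1.4-2": (43879974, 13645574, 8307308),
}

def percent(delta_size, deb_size):
    """Delta size as a percentage of the full deb size."""
    return 100 * delta_size / deb_size

for name, (deb, rdiff, xdelta) in samples.items():
    print(f"{name}: rdiff {percent(rdiff, deb):.1f}%, "
          f"xdelta {percent(xdelta, deb):.1f}%")
```

In these four samples the delta ranges from under 1% of the deb (tetex-base) to roughly 19% with xdelta or 31% with rdiff (openoffice.org-bin 1.1.4), and xdelta is smaller in every pair.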

By the way, while I was testing the script, I found https://wiki.ubuntu.com/APTPackageDeltas. Hmm, it takes deltas in a very similar way, but its objective seems somewhat different from mine. My script tries to regenerate exactly the same deb as the original (though it may not be perfect yet).
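The round trip described above (take a delta, then regenerate a byte-identical deb) can be sketched as follows. This is not the shell script from this entry, only a minimal Python illustration using rdiff's three standard subcommands (`signature`, `delta`, `patch`); the helper names and the working file names are hypothetical.

```python
import hashlib
import subprocess

def rdiff_commands(base, target, work="t"):
    """Return the rdiff invocations for one round trip (file names are hypothetical)."""
    sig, delta, rebuilt = f"{work}.sig", f"{work}.rdiff", f"{work}.rebuilt.deb"
    return [
        ["rdiff", "signature", base, sig],          # summarize the base deb
        ["rdiff", "delta", sig, target, delta],     # delta from base to target
        ["rdiff", "patch", base, delta, rebuilt],   # rebuild target from base + delta
    ]

def sha256(path):
    """Checksum of a whole file, used to compare the rebuilt deb with the original."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def round_trip(base, target, work="t"):
    """Run the round trip; True means the rebuilt deb is byte-identical to target."""
    commands = rdiff_commands(base, target, work)
    for argv in commands:
        subprocess.run(argv, check=True)
    rebuilt = commands[-1][-1]
    return sha256(target) == sha256(rebuilt)
```

For example, `round_trip("tetex-base_3.0-2_all.deb", "tetex-base_3.0-3_all.deb")` would need both files and the rdiff binary to be present. Comparing checksums is what distinguishes "the same deb" from "an equivalent deb".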

I also found another debdelta.

[01:34]

Sun, 19 Jun 2005

may have lost old snapshot.debian.net archive...

gniibe tried to repair the disks, but ran into trouble. (The md device would run recovery from the disks, including the new one, onto a good one, so the good disk would be overwritten with bad data.)

We tried to fix this situation by running fsck with the 2nd alternate superblock, but it keeps fixing a huge number of inodes... Maybe the inode bitmap and group blocks have been broken, so most files would be lost...

At worst, we have lost the old snapshots from more than 3 months ago. In more detail, we have the older snapshots from 2002/06/04 to 2004/02/26 on another machine, so these are still available, but 2004/02/27 to 2005/03/12 may have disappeared...

[23:30]

Sat, 18 Jun 2005

Failed to repair broken disk ...

After the FSIJ monthly seminar, we discussed the possibility of a "summer of code" in Japan. What theme would be good for a "summer of code"?

Anyway, since a disk is broken in the snapshot.debian.net archive, we tried to fix it.

gniibe, knok and ukai went to Akihabara to buy a 250GB HDD (WD 2500JB) and had supper at Kitchen Jiro.

Then, we went to the Ootemachi NOC.

Here is the record:

# cat /proc/mdstat
Personalities : [raid1] [raid5] [multipath]
read_ahead 1024 sectors
md5 : active raid5 sdh1[3](F) sdg1[2] sdf1[1] sde1[0] sdd1[7] sdc1[6] sdb1[5] sda1[4]
      1709370880 blocks level 5, 128k chunk, algorithm 0 [8/7] [UUU_UUUU]
(snip)
# mdadm --detail /dev/md5
/dev/md5:
        Version : 00.90.00
  Creation Time : Sun May  9 20:27:05 2004
     Raid Level : raid5
     Array Size : 1709370880 (1630.18 GiB 1750.40 GB)
    Device Size : 244195840 (232.88 GiB 250.06 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 5
    Persistence : Superblock is persistent

    Update Time : Fri May 20 01:46:33 2005
          State : active, degraded
 Active Devices : 7
Working Devices : 7
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-asymmetric
     Chunk Size : 128K

           UUID : 57db5f7d:550a2aed:f3dc21f8:2dc771c6
         Events : 0.4

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync   /dev/sde1
       1       8       81        1      active sync   /dev/sdf1
       2       8       97        2      active sync   /dev/sdg1
       3       8      113        3      faulty   /dev/sdh1
       4       8        1        4      active sync   /dev/sda1
       5       8       17        5      active sync   /dev/sdb1
       6       8       33        6      active sync   /dev/sdc1
       7       8       49        7      active sync   /dev/sdd1

Hmm, /dev/sdh1 is faulty. So, I removed it from /dev/md5 and shut down the machine.

# mdadm /dev/md5 -r /dev/sdh1
# shutdown -h now
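The sizes in the `mdadm --detail` output above can be cross-checked with a little arithmetic: an 8-device RAID5 array spends one device's worth of space on parity, so its usable size should be 7 times the per-device size. A minimal Python sketch, with numbers copied from the output and variable names of my own:

```python
# Values from the mdadm --detail output above, in 1 KiB blocks.
device_size_kib = 244195840   # "Device Size"
raid_devices = 8              # "Raid Devices"

# RAID5 keeps one device's worth of parity, so usable space is (n - 1) devices.
array_size_kib = (raid_devices - 1) * device_size_kib

print(array_size_kib)                                # 1709370880
print(f"{array_size_kib / 1024**2:.2f} GiB")         # 1630.18 GiB
print(f"{array_size_kib * 1024 / 1000**3:.2f} GB")   # 1750.40 GB
```

The result matches the reported "Array Size" line. It also shows why losing a second disk would be fatal: with 7 of 8 devices active, the array has no redundancy left.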

Unfortunately, we didn't have the key to open the HDD case! That is, the broken 250GB HDD was locked in the USB box! gniibe said the key should be in the office at Bunkyo Green Court. Bravely, we took the case apart and tried to open it by force. We finally replaced the disk with the new one and rebooted the machine.

The disk was recognized successfully. However, accessing it failed; I couldn't even run fdisk on the disk. Oh my god... We discussed it and concluded that opening the HDD box by force was the cause of the failure. There was nothing more we could do today, so I configured the RAID5 with only 7 disks and mounted it read-only.

# mdadm --assemble /dev/md5 --run /dev/sd{a,b,c,d,e,f,g}1
# mount -o ro /dev/md5 /archive

gniibe will find the key in the office and replace/repair the HDD box tomorrow. Thanks!

[20:05]

Fri, 17 Jun 2005

Disk failure in RAID5 of snapshot.debian.net

I've noticed that a disk in the RAID5 has been disabled due to an I/O error.

scsi7: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 02 ab 2d 3f 00 00 f8 00
Current sd08:71: sense key Medium Error
Additional sense indicates Unrecovered read error
 I/O error: dev 08:71, sector 44772608
raid5: Disk failure on sdh1, disabling device. Operation continuing on 7 devices
md: recovery thread got woken up ...
md: updating md5 RAID superblock on device
md: (skipping faulty sdh1 )
md: sdg1 [events: 00000004]<6>(write) sdg1's sb offset: 244195904
md5: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: sdf1 [events: 00000004]<6>(write) sdf1's sb offset: 244195904
md: sde1 [events: 00000004]<6>(write) sde1's sb offset: 244195904
md: sdd1 [events: 00000004]<6>(write) sdd1's sb offset: 244195904
md: sdc1 [events: 00000004]<6>(write) sdc1's sb offset: 244195904
md: sdb1 [events: 00000004]<6>(write) sdb1's sb offset: 244195904
md: sda1 [events: 00000004]<6>(write) sda1's sb offset: 244195904
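The first lines of the kernel log above can be decoded by hand. A 10-byte SCSI READ command carries a 4-byte big-endian logical block address followed, after one more byte, by a 2-byte transfer length. The Python sketch below assumes the nine bytes printed after "Read (10)" are the command bytes that follow the opcode, and that "dev 08:71" is a hexadecimal major:minor pair; both readings are my interpretation of the log format.

```python
# The 9 bytes printed after "Read (10)" in the log above (assumed to be the
# command bytes following the opcode).
cdb_tail = bytes.fromhex("00 02 ab 2d 3f 00 00 f8 00")

lba = int.from_bytes(cdb_tail[1:5], "big")      # logical block address on the disk
length = int.from_bytes(cdb_tail[6:8], "big")   # number of blocks requested

reported_sector = 44772608                      # from "I/O error: dev 08:71, sector ..."
major, minor = (int(part, 16) for part in "08:71".split(":"))

print(lba, length)             # 44772671 248
print(lba - reported_sector)   # 63
print(major, minor)            # 8 113
```

Minor 113 on major 8 is the same device that `mdadm --detail` lists as /dev/sdh1. The difference of 63 between the disk address and the reported sector is consistent with a sector number relative to a partition that starts at sector 63, but that is an inference rather than something the log states.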

Hmm, we (FSIJ) need to prepare new disks for snapshot.debian.net: not only will it soon be short on disk space (1.4T used, 177G avail, 89% used), but the RAID5 also has no spare disks left...

[01:30]

Sat, 26 Mar 2005

snapshot.debian.net reaches 1K days!

Today, snapshot.debian.net reaches 1K days since 2002/06/04.

 % ruby -rdate -e 'puts Date.parse("2002/06/04")+1024'
 2005-03-24
 % ruby -rdate -e 'puts Date.parse("2005/03/24")-Date.parse("2002/06/04")'
 1024

It requires 1.4T of disk space to hold 1K days of Debian archives. Unfortunately, some days are missing because of mirror trouble and the like.

Anyway, woody was released on 2002/07/19, so 1000 days of woody (2005/04/14) or 1K days of woody (2005/05/08) will come soon...
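The woody dates above follow from the same date arithmetic as the Ruby one-liners. A Python equivalent, using only the dates given in this entry:

```python
from datetime import date, timedelta

snapshot_start = date(2002, 6, 4)    # first day of snapshot.debian.net
woody_release = date(2002, 7, 19)    # woody release date

print(snapshot_start + timedelta(days=1024))        # 2005-03-24
print((date(2005, 3, 24) - snapshot_start).days)    # 1024
print(woody_release + timedelta(days=1000))         # 2005-04-14
print(woody_release + timedelta(days=1024))         # 2005-05-08
```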

Update on 2005/03/26: here is the df(1) result:

 $ df .
 Filesystem           1K-blocks      Used Available Use% Mounted on
 /dev/md5             1682548700 1298196556 298883600  82% /archive
 $ df -H .
 Filesystem             Size   Used  Avail Use% Mounted on
 /dev/md5               1.8T   1.4T   307G  82% /archive
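The two df outputs above describe the same numbers in different units. The Python sketch below reproduces the `-H` figures from the 1K-block counts, assuming that df computes Use% as used / (used + avail) and rounds displayed values up; the variable names are mine.

```python
import math

# 1K-block counts from the plain "df ." output above.
total_k, used_k, avail_k = 1682548700, 1298196556, 298883600

# Use% relative to the space ordinary users can use, rounded up.
use_percent = math.ceil(100 * used_k / (used_k + avail_k))

# Space that is neither used nor available, as a share of the total.
reserved_share = 100 * (total_k - used_k - avail_k) / total_k

# df -H uses powers of 1000: tenths of a terabyte and whole gigabytes, rounded up.
size_tenths_tb = math.ceil(total_k * 1024 / 10**11)
used_tenths_tb = math.ceil(used_k * 1024 / 10**11)
avail_gb = math.ceil(avail_k * 1024 / 10**9)

print(f"{use_percent}%")                  # 82%
print(f"{size_tenths_tb / 10:.1f}T")      # 1.8T
print(f"{used_tenths_tb / 10:.1f}T")      # 1.4T
print(f"{avail_gb}G")                     # 307G
print(f"{reserved_share:.1f}% held back")
```

About 5% of the filesystem shows up as neither used nor available, which is presumably the blocks ext3 reserves for root. That is why Used plus Available is smaller than the total.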

[03:32]

HTTrack attack(?) from 218.94.37.0/24

I noticed that snapshot.debian.net was somehow slow, and checking access.log showed that 218.94.37.185 was trying to get all(?) contents using HTTrack 3.0x ("Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"). It's reckless access, isn't it?

In time, he will probably fail with a full disk. However, I think it is slowing snapshot.debian.net down. The limit is not bandwidth (100Mbps FTTH), but probably machine resources that are poor by today's standards (Pentium III 600MHz/256k cache, 512MB mem, Intel EthernetPro 100).

For the time being, I reject the access from 218.94.37.0/24.
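For reference, a /24 block like the one above covers 256 addresses, including the client seen in access.log. The Python sketch below shows the membership test and a simple user-agent check. It only illustrates the matching logic, not how the rejection was actually configured on the server, and the function names are hypothetical.

```python
import ipaddress

# The range rejected above.
blocked = ipaddress.ip_network("218.94.37.0/24")

def is_blocked(addr):
    """True if the client address falls inside the rejected range."""
    return ipaddress.ip_address(addr) in blocked

def is_httrack(user_agent):
    """True if the User-Agent string identifies itself as HTTrack."""
    return "HTTrack" in user_agent

print(is_blocked("218.94.37.185"))   # True
print(is_blocked("218.94.38.1"))     # False
print(blocked.num_addresses)         # 256
print(is_httrack("Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"))  # True
```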

[03:27]

Wed, 16 Mar 2005

Failed to mirror March 15th...

Mutsumi, who runs hanzubon.jp, a downstream mirror of snapshot.debian.net, notified me that snapshot.debian.net failed to mirror on March 15th.

I'm afraid this is because of a hard disk error, since the current drive has reported some errors in the past (they can be seen with dmesg). And I'm in Korea now, so I can't repair it soon....

Anyway, the reason for the failure is that snapshot tried to fetch from upstream while upstream itself was mirroring, so some files disappeared during the rsync and rsync exited with an error.

I tried to recover yesterday's debian from ftp.de.debian.org, but it may already have been updated..

[13:57]
