IRC Archive / Freenode / #centos / 2007 / January / 14 / 1
flippo
Made an interesting discovery after installing 4.4: "kernel: CPU0: Temperature above threshold"
I guess I better open up the box and clean out the dust bunnies
Never saw that message with fc3
It does explain why the system seemed to have slowed down: "kernel: CPU0: Running in modulated clock mode"
faust_
flippo: had the same deal with my desktop machine.
When I checked the cpu cooler, there was a solid 2mm layer of dust between the fan and the heatsink, making sure no cool air got through.
flippo
faust_, yikes
I better not wait another moment. yum -y update is finished.
faust_
No wonder it kept rebooting on me :)
AMD system, so no warning. Just shutdown.
flippo
Time to do a little vacuuming.
faust_
Bunnykiller ftw :)
ke4qqq
hey guys, running centos 4.4, have a weird nic situation. I have tried 4 nics (2 Marvell chipsets and 2 Realtek chipsets). The nic shows link state as active per ethtool/dmesg, and using ethereal it sees and even responds to traffic (ARP only, as no other machine sees any traffic from it). Any chance this is something obvious (or not so) that someone has dealt with before?
I have also tried multiple cables, multiple switch ports and even multiple switches
and nada.....to complicate things, I have two other nics on the same box (different subnet) which work flawlessly.
alleycat
ke4qqq: and you are sure there are no routing issues, like in packets going out on a wrong interface
or arp_filter in action ?
ke4qqq
I thought that might be a problem, but even taking down all of the other interfaces doesn't resolve it
I have also checked route to make sure it wasn't a problem
don't know about arp_filter, let me check that
alleycat: arp_filter is off for the nic in question as well as the other nics
         

alleycat
ke4qqq: what motherboard is that ?
ke4qqq
it's dell 715n, don't recall what chipset/mobo that is.
alleycat
the reason I ask is that there are motherboards
which do not provide same hardware resources to all slots
I had situations where the very same pci card worked in 2 slots, but not in the other 2
some slots have "master" capabilities, some not... something like that
like in master busmaster DMA
of, typo error, make it "like in busmaster DMA"
ke4qqq
any way of using lspci or something else to determine this?
alleycat
maybe lspci -v, maybe dmidecode
I have no idea, to say the truth
I have just educated myself to prefer the slots beighboring the agp slot
i really need a smarter keyboard, which would correct my typing errors
ke4qqq
hmmm lspci shows it as a bus master, and identical settings to the working nics, doesn't appear to be sharing IRQ.....
dmidecode doesn't really tell me anything
alleycat
lspci will show you what the card says
not what the mobo provides
ke4qqq
ahhhh ok
alleycat: follow up to that question, I have this line when I run ifconfig for the nic in question TX packets:506 errors:0 dropped:20685 overruns:0 carrier:0 what would the reason for the drops be?? I have iptables filtering nothing (service iptables stop)
alleycat
no, that drop is at a lower level than where iptables would intervene
it's layer1 or 2
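To put the counters ke4qqq quoted in proportion, here is a minimal sketch that parses the `TX packets` line from ifconfig and works out the share of dropped frames. It assumes the `packets` and `dropped` counters are disjoint (sent versus never sent); the helper name `parse_tx_line` is made up for illustration.

```python
import re

def parse_tx_line(line):
    """Return the counters from an ifconfig 'TX packets' line as a dict of ints."""
    return {key: int(value) for key, value in re.findall(r"(\w+):(\d+)", line)}

# The line quoted in the channel.
tx = parse_tx_line("TX packets:506 errors:0 dropped:20685 overruns:0 carrier:0")

# Assumption: 'packets' counts frames that went out, 'dropped' counts frames that did not.
attempted = tx["packets"] + tx["dropped"]
drop_share = tx["dropped"] / attempted

print(f"{tx['dropped']} of {attempted} frames dropped ({drop_share:.1%})")
```

Under that assumption nearly every outgoing frame is being discarded while `errors` stays at 0, which fits the symptom that other machines see almost nothing from this nic.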
Daemoen
hey guys, power flickered this morning and system went down, now it wants an fsck... try to run it from the livecd, and it complains because i am using lvm... how do you fsck lvm?
alleycat
looks like transmission errors
ke4qqq
that's what I thought, but perhaps all the traffic is being dropped
but errors = 0
alleycat
Daemoen: flsck /dev/mapper/volname
fsck
Daemoen
k.
hopefully this works... as i cant afford to lose the partitions its griping about
alleycat
use the version of lvm tools on the cd to find out the name
most probably there is a static version
which requires something similar to
lvm.static required_function
in order to be used
Daemoen
(Action) thinks alleycat has had too much time to play with lvm issues :-D
( but thats a good thing )
alleycat
Daemoen: need is an excellent teacher
ke4qqq
lvm is a wonderful thing.....
alleycat
I have a machine with 10 disks
and volumes spread all over
Daemoen
hmmm, the most any of my machines have is 6 drives
         

ke4qqq
LVM is wonderful for deploying iSCSI volumes....I have LVM atop 8 disks being shared out over iSCSI
alleycat
that's a machine with a "few" xen guests
raid5 among drives, that sort of stuff
ke4qqq
alleycat: do you use md and then LVM atop that?
or some other method to get RAID 5?
Daemoen
md?
alleycat
ke4qqq: yes
ke4qqq
Daemoen: multiple device, ie software raid in linux
Daemoen
ahh
never messed with software side :)
alleycat
ke4qqq: as well as md on top of lvm volumes :)
Daemoen
if gonna play with raid, might as well have a raid controller
(Action) sighs
alleycat
Daemoen: depends heavily on the budget
you get two computers with the price of one controller which knows 8 disks
ke4qqq
Daemoen: but what if your raid controller goes bad, and is no longer produced.....or even if it is still produced, but takes a week to get to you..... software raid allows use on virtually any hardware
Daemoen
"fsck.ext2: Attempt to read block from filesystem resulted in short read while trying to open /dev/VolGroup00/Web Could this be a zero-length partition"
(Action) sighs
(Action) has a feeling he can kiss the data goodbye at this point, which is a VERY VERY bad thing
alleycat
Daemoen: NO
Daemoen: the data is there, don't screw it
Daemoen
im not :-D
alleycat
just because NOW you are not familiar with the tools
you just have to learn
Daemoen
thats why i came here, ive never had this issue before, so figured this was the best chance to learn :)
I would be most appreciative of guidance from this point :)
alleycat
Daemoen: what does vgdisplay say ?
ke4qqq: dumb question
ke4qqq: you said you have tested multiple cables and switch ports
ke4qqq
alleycat: yes
alleycat
did you happen to use the same socket all the time ?
ke4qqq
you mean pci slot? yes, as there is only one available on this mobo (1U RM box)
alleycat
like in computer - patch cord - socket - cable - patch panel - patch cord -switch
Daemoen
it says: VG Name: VolGroup00, format lvm 2, metadata areas 2, metadata seq 18, vg access: rw, status resizable, max lv 0, cur lv 13, cur pv 2, act pv 2, vg size: 297.97 GB, pe size 32MB, total pe: 9535, alloc pe / size: 9535 / 297.97 GB free pe / size 0 / 0, vg uuid: caMRao-a06T-FgXk-3CdB-g51d-w63P-TlusRJ
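A quick sanity check on those vgdisplay figures: the volume group size should equal the number of physical extents times the extent size. This sketch assumes the "GB" and "MB" in the output are binary units (1 GB = 1024 MB), which the arithmetic appears to confirm; the function name is illustrative only.

```python
MIB_PER_GIB = 1024

def vg_size_gib(total_pe, pe_size_mib):
    """Volume group size in GiB from the extent count and the extent size in MiB."""
    return total_pe * pe_size_mib / MIB_PER_GIB

# Figures from the vgdisplay output pasted in the channel.
total_pe = 9535
alloc_pe = 9535
pe_size_mib = 32

size = vg_size_gib(total_pe, pe_size_mib)
free = vg_size_gib(total_pe - alloc_pe, pe_size_mib)

print(f"VG size: {size:.2f} GB, free: {free:.2f} GB")
```

The result matches the reported 297.97 GB with nothing free, so the volume group metadata is at least internally consistent.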
alleycat
no, I was speaking about a RJ45 socket in the wall
Daemoen
( was trying to keep that less spammy so sorry for the messy format )
ke4qqq
ohhh no sorry, patch cable really isn't a patch cable, just a cable from nic to switch
this is all in the same rack
flippo
Darn, the kernel is complaining about the temperature again. Evidently, getting rid of dust bunnies was not enough.
alleycat
i see...
strange
Daemoen
(Action) wonders who the strange is for...
alleycat
Daemoen: it was for ke4qqq
Daemoen: looks fine. what does lvdisplay say ?
ke4qqq: the drops in ifconfig usually point to an error on the cabling side
Daemoen
lots, lol
one sec, lemme open up a browser and stuff on that machine
paste the contents at pastebin
alleycat
are you sure the speed and the rest of the settings match between linux and the switch ?
full/half duplex etc
Daemoen
http://rafb.net/p/eaZAki13.html
alleycat
hum... fsck /dev/VolGroup00/Web should have worked
Daemoen: please try dumpe2fs -h /dev/VolGroup00/Web
Daemoen
also, i dont know how much of a difference this would make, but a long while back i used one of the programs avail in the kde menus to delete the /Home6 and /Web6, after deleting those two, i created /Vmachines with the space that was freed
whether that would have caused any problems that were delayed i dont know
"dumpe2fs: Attempt to read block from filesystem resulted in short read while trying to open /dev/VolGroup00/Web Couldn't find valid filesystem superblock
alleycat
Daemoen: deleting/recreating should not matter
Daemoen: could you please try to run dumpe2fs -h against other volumes?
just let me know if it works or you get more errors
it seems that the first superblock of the Web volume is not valid
Daemoen
all checked, only /Web is doing it
ke4qqq
alleycat: yeah it's 1G full duplex.... which is what the switch supports....per ethtool anyway
alleycat
ke4qqq: try setting the speed manually on both sides
and start slow, like in 10 Mbps/half for instance
by the way
what does the switch say about the interface? any errors?
Daemoen: read the man page of fsck
Daemoen
since centos doesnt have newfs, what is the command that will list all the superblocks on the /Web :)
i already was :)
alleycat
and try to search for an alternate sb
ke4qqq
good question, let me look at the switch
alleycat: you won't believe this, but it's the only unmanaged switch in the place (that I know of)
alleycat
i do believe... no free ports in another switch ?
ke4qqq
yeah, I am looking to see where I can plug it in at....
alleycat
we are not on the right channel, I could tell you stories about IBM laptops refusing to connect to HP switches... with a dumb ATI switch in between, all is good
ke4qqq
interesting, I just ran ethtool -S and got this line: align_errors: 4635
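Driver statistics from `ethtool -S` can run to dozens of lines, so a small filter helps surface the counters that matter. This is a rough sketch: only the `align_errors: 4635` line comes from the channel, the rest of the sample text is invented, and counter names vary by driver.

```python
def nonzero_error_counters(stats_text):
    """Return {name: value} for counters whose name contains 'err' and whose value is above zero."""
    counters = {}
    for line in stats_text.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        # Skip the header line and anything that is not a plain "name: number" pair.
        if not sep or not value.isdigit():
            continue
        if "err" in name and int(value) > 0:
            counters[name] = int(value)
    return counters

# Hypothetical sample; only align_errors is taken from the log.
sample = """NIC statistics:
     tx_packets: 506
     rx_errors: 0
     align_errors: 4635
"""

print(nonzero_error_counters(sample))
```

Alignment errors are counted at the link level, which is consistent with alleycat's suspicion of a cabling, port, or speed/duplex problem.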
Daemoen
(Action) boggles
it seems the fsck on centos doesnt allow you to specify a superblock if the primary is bad
alleycat
Daemoen: yes it does
Daemoen
found it, you have to use e2fsck :-D nvm :)
alleycat
-b
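The block number to hand to `e2fsck -b` depends on the filesystem's block size. As a sketch of where the backups usually live, the function below assumes mke2fs defaults: 8 × block size blocks per group and the sparse_super feature, which keeps backups in group 1 and in groups that are powers of 3, 5 and 7. The function name and the example filesystem size are hypothetical.

```python
import math

def backup_superblock_locations(block_size, total_blocks):
    """List likely backup superblock block numbers for an ext2/ext3 filesystem.

    Assumes mke2fs defaults: 8 * block_size blocks per group and sparse_super.
    """
    blocks_per_group = 8 * block_size
    # With 1 KiB blocks the first data block is block 1, otherwise block 0.
    first_data_block = 1 if block_size == 1024 else 0
    group_count = math.ceil((total_blocks - first_data_block) / blocks_per_group)

    groups = {1} if group_count > 1 else set()
    for base in (3, 5, 7):
        group = base
        while group < group_count:
            groups.add(group)
            group *= base
    return sorted(first_data_block + g * blocks_per_group for g in groups)

# Hypothetical filesystem: 4 KiB blocks, one million blocks in total.
print(backup_superblock_locations(4096, 1_000_000)[:3])  # [32768, 98304, 163840]
```

The first value for each block size agrees with the ones listed in the e2fsck man page (8193 for 1k blocks, 32768 for 4k blocks). If the filesystem was created with non-default options the real locations will differ, so treat these as candidates to try rather than guarantees.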
Daemoen
Bad magic number in super-block while trying to open /blahblahblah
found one! yay!
have you ever seen it fail to write a new superblock successfully alleycat ?
alleycat
Daemoen: only with defective hardware
amd so I started to avoid maxtor
*and so
ke4qqq
welll....I feel like a bit of an idiot....
alleycat
ke4qqq: yesterday I have won the prize "the dumb of the day" awarded by z00dax