Novena Issue Log
Revision as of 16:28, 20 January 2013
Bringup status by subsystem
Subsystem | basic status | extended status | notes |
---|---|---|---|
Sleep/suspend | Need CI testbench for sleep/suspend to verify extended status | ||
PMIC base | OK | voltages nominal, but on high side for CPU (1.35V) -- need PMIC DVFS driver to dial it back | |
PMIC advanced | needs driver for further testing | ||
RTC backup battery | |||
RTC | (extended status should also measure clock drift) | ||
debug console | OK | OK | |
SDHC3 (microSD boot) | OK | ||
SDHC3 power switch | |||
DDR3 base | OK | functions at 1066 MT/s, 4 GB 1 and 2 rank configured, using http://memtester.sourcearchive.com/documentation/4.0.8/files.html for testing in userspace | |
DDR3 extended | OK | tested with 1 and 2 rank DIMMs, 1-4 GB configurations | |
I2C1 (SMB) | OK | can read out DDR3 I2C config using i2cdump | |
I2C2 | OK | can read out accelerometer, PMIC bits using i2cdump | |
I2C3 | OK | ||
reset button | OK | ||
USB hub 1 | OK | tested with thumb drive, needs performance testing | |
USB hub 2 | OK | ditto | |
USB ext1 | OK | ||
USB ext1 power switch | |||
USB ext2 | OK | ||
USB ext2 power switch | |||
ASIX ethernet | OK | OK | 84,085,191 bytes in 18.53s = 36 Mbps (limited by external fiber uplink speed). Note that MAC address must be generated and assigned (not fixed in hardware ROM) |
Gbit ethernet | OK | NEEDS TUNING | 80+ Mbps performance @ 100Mbit connection speed. 240+ Mbps performance @ 1000Mbit speed, possibly limited by test server capability or router saturation. Gbit ethernet extended status needs detailed NEXT, FEXT, jitter, etc. characterization before it can get a clean bill of health; I suspect there is more tuning to do. Note that MAC address must be generated and assigned (not fixed in hardware ROM) |
SDHC4 | |||
utility EEPROM | OK | use 16-bit mode for access. eeprom tools need some tweaking, just wrote a couple bytes and declared success. | |
audio base | |||
audio power switch | |||
speakers | |||
headphone | |||
analog mic in | |||
digital mic | |||
USB keyboard/mouse port | |||
USB keyboard/mouse power switch | |||
USB high current (1.5A) charging | |||
USB OTG | |||
HDMI | OK | Needs HPD to be inverted. | |
FPGA | |||
FPGA apoptosis option | |||
on-board USB wifi | |||
wifi power switch | |||
PCI-express | OK | Tested Atheros wifi card. Comes up on boot, can do long wifi transfers. | |
PCI-express power switch | |||
PCI-express embedded USB | |||
USIM | |||
USIM power switch | |||
LCD port | |||
LCD port USB | |||
LCD VCC power switch | |||
LCD backlight power switch | |||
touchscreen | |||
user function button | |||
uart 3 | |||
uart 4 | |||
accelerometer | OK | i2cset -y 0 0x1d 0x16 0x01; i2cdump -y -r 6-8 0 0x1d . seems to accelerometate just fine. | |
SATA | OK | 100 MHz clock must manually be taken out of reset: devmem2 0x020c80e0 w 0x80102001 | |
SATA power switch | |||
battery interface | |||
boot option headers | OK | Was able to load U-Boot from SDHC3, SDHC4, and SATA | |
JTAG | |||
FPGA SPI memory | |||
FPGA SPINOR memory | |||
FPGA ADC | |||
Raspberry Pi peripheral header |
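The ASIX throughput figure in the table follows directly from the byte count and transfer time. A minimal sketch of that arithmetic, assuming decimal megabits (1 Mbit = 10^6 bits); the function name `mbps` is just for illustration:

```shell
# Convert a transfer of N bytes in T seconds to megabits per second.
# Assumes decimal megabits (1 Mbit = 1,000,000 bits).
mbps() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes * 8 / secs / 1e6 }'
}

# ASIX ethernet row: 84,085,191 bytes in 18.53 s
mbps 84085191 18.53
```

This prints 36.3, which matches the "36 Mbps" entry after rounding.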
Power consumption notes
System at idle with no PM code running and 1GB standard RAM draws 0.34A at 11.3V (measured at input regulator cap)
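For reference, the idle measurement works out to a bit under 4 W at the input. A quick sketch of the calculation (the helper name `power_watts` is hypothetical):

```shell
# Input power in watts from measured input voltage (V) and current (A).
power_watts() {
    awk -v v="$1" -v i="$2" 'BEGIN { printf "%.2f\n", v * i }'
}

# Idle measurement: 11.3 V, 0.34 A
power_watts 11.3 0.34
```

This prints 3.84, i.e. roughly 3.8 W at idle, measured at the input regulator cap and so including regulator losses downstream of that point.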
Known issues
Hardware
- Inrush current limiting for 3.3V_DELAYED turn-on: R38N should be increased to about 30k. Need to verify the turn-on timing margin by experiment (i.e. put smaller values in until failure to determine how much margin is available at 30k, to ensure consistency across process variation)
- SN001 has 33k -- revised to 10k to match other boards
- SN004 has 10k (CPU did not boot at all with 47k)
- SN003 has 10k (PMIC does not respond to commands post-boot with 47k)
- SN002 has 10k, but has other problems preventing boot
- Fix bug where reset button does not work due to FPGA pulling boot fuses to look at SATA instead of SD card. Resolution is to change HSWAPEN to high, which turns off FPGA pull-ups. ECO applied to all 5 boards.
- 1Gbit PHY reset circuit (per Micrel datasheet) interferes with driver timing. The reset rises too slowly, driver checks MII status immediately and never retries dynamically (does a static read-out of MII data). Fix is to remove the local reset circuit. ECO applied to all 5 boards.
- PCIE_PWRON signal wasn't connected. Oops! In next rev, it will go to GPIO_16, pad R1. For now, it will be hard-wired to powered on. Fix done on all 5 boards.
- 1Gbit PHY magnetics termination is incorrect. Remove R14G to improve performance to gigabit speeds. This was a lucky shot in the dark, however; I don't actually understand what's going on with the center-tap termination, or what the trade-offs are. The link still needs to be deeply characterized for NEXT, FEXT, etc. to verify that it is optimally terminated.
- 1Gbit refclock stability (sourced by the PHY chip) looks poor. Termination seems to be causing some signal amplitude degradation. It doesn't seem to have fundamental jitter; rather, the waveform's shape is unstable, which causes the crossing to shift as the waveform changes shape. This needs to be investigated further. Shorting across R21G improves signal integrity to the point where the link has both Tx and Rx stability on all 4 characterizable boards; change done on all 5 boards.
- HDMI HPD sense is inverted. Need to add Q17L, R29L, and swap R28L/R27L to fix the issue. Done to four boards (Jacob's board is unavailable).
DDR3 Bringup
- See novena ddr3 notes for ddr3 bringup notes.
ECO list
- R38N change to 10k, 1% 0402 (resolve inrush current limit issue)
- R12F to DNP, R13F to 4.7k, 1% 0402 (resolve boot fuse issue with FPGA pull-ups)
- remove C32G, D12G, R20G, D11G; short across D12G with wire jumper or 0805 resistor, 0 ohm (resolve gbit ethernet reset issue)
- remove R11X, tie gate of Q10X to P3.3V_DELAYED (off of pin 2 of Q11X) (resolve PCIE power on issue)
- remove R14G (possibly resolve magnetics termination issue)
- replace R21G with 0-ohm resistor or shunt (partially resolve Gbit refclock stability issue)
- move R28L to R27L (HDMI HPD swap)
- add Q17L (HDMI HPD swap)
- add R29L (HDMI HPD swap)
Software
U-boot
- DDR3: need to come up with alternate poke files for different SO-DIMM types
- DDR3: need to figure out how to configure u-boot to recognize greater amounts of DRAM
- MMC: USDHC3 has to have the CD check return 1 at all times.
Linux
- MMC: device tree novena.dts descriptor edited to note that USDHC3 port is non-removable in order to enable boot
- Power: Driver for power currently assumes fixed regulators, an incorrect assumption. This needs to be changed to use the PFUZE PMIC. Currently, no drivers exist in the source tree (based off of the sabrelite). Recommend using Sabre board for smart devices as the base image instead of sabrelite - todo xobs
- USB/Power: PMIC does not turn on the USB VBUS by default, which causes internal root hub to fail. To fix this:
- add i2c2 to device tree (added to novena.dts)
- run this command to turn on the boost regulator:
i2cset -y 1 0x08 0x66 0x48
- once USB is on, I can verify/see USB drives in both USB ports. See this:
root@novena:~# lsusb
Bus 002 Device 002: ID 05e3:0614 Genesys Logic, Inc.
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 007: ID 0b95:772b ASIX Electronics Corp.
Bus 002 Device 004: ID 05e3:0614 Genesys Logic, Inc.
Bus 002 Device 006: ID 058f:6387 Alcor Micro Corp. Transcend JetFlash Flash Drive
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
- ASIX driver module isn't built into current image. Need to figure out how to turn that on.
- Configuring ASIX:
ip link set dev eth1 down
ip link set dev eth1 address de:ad:fe:ed:00:01
ip link set dev eth1 up
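Since neither ethernet MAC is fixed in hardware ROM, each board needs a generated address. One possible approach, sketched below, is to pick a random locally administered unicast address (first octet has bit 1 set and bit 0 clear, which is also true of the `de:...` example above). The function name `gen_mac` is hypothetical, and the use of `/dev/urandom` is an assumption, not what was actually done on these boards:

```shell
# Generate a random locally administered, unicast MAC address.
gen_mac() {
    # Read 6 random bytes as two-digit hex octets into $1..$6.
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # First octet: set bit 1 (locally administered), clear bit 0 (unicast).
    first=$(printf '%02x' $(( (0x$1 & 0xfc) | 0x02 )))
    printf '%s:%s:%s:%s:%s:%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
}

# Example use on the board (requires root), following the steps above:
#   ip link set dev eth1 down
#   ip link set dev eth1 address "$(gen_mac)"
#   ip link set dev eth1 up
```

A random address changes on every call, so for a stable per-board address the generated value would need to be stored somewhere persistent.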
Gbit Ethernet Characterization
- Performance results benchmarking 1000Mbit (gigabit) ethernet (run on SN004):
bunnie@crashbox:/var/www/bunnie$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55445
------------------------------------------------------------
Client connecting to 10.0.39.241, TCP port 5001
TCP window size: 47.0 KByte (default)
------------------------------------------------------------
[ 6] local 10.0.39.142 port 38307 connected with 10.0.39.241 port 5001
[ ID] Interval Transfer Bandwidth
[ 6] 0.0-10.0 sec 280 MBytes 234 Mbits/sec
[ 4] 0.0-11.0 sec 6.01 MBytes 4.57 Mbits/sec
[ 5] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55446
------------------------------------------------------------
Client connecting to 10.0.39.241, TCP port 5001
TCP window size: 47.0 KByte (default)
------------------------------------------------------------
[ 6] local 10.0.39.142 port 38308 connected with 10.0.39.241 port 5001
[ 6] 0.0-10.0 sec 286 MBytes 239 Mbits/sec
[ 5] 0.0-10.1 sec 3.77 MBytes 3.12 Mbits/sec
[ 4] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55447
------------------------------------------------------------
Client connecting to 10.0.39.241, TCP port 5001
TCP window size: 47.0 KByte (default)
------------------------------------------------------------
[ 6] local 10.0.39.142 port 38309 connected with 10.0.39.241 port 5001
[ 6] 0.0-10.0 sec 286 MBytes 240 Mbits/sec
[ 4] 0.0-10.1 sec 3.48 MBytes 2.88 Mbits/sec
[ 5] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55448
(from running root@novena:~# iperf -c 10.0.39.142 -d three times over)
Server is a Supermicro X9SCL with a Xeon E3-1230 CPU and 8GB ram, running off of an SSD using Ubuntu 12.04.1 LTS.
Both client and server are plugged into a TP-Link 5-port gigabit desktop switch, TL-SG1005D.
Not sure why the result is asymmetric. It seems to be real: the upload speed from Novena is pretty slow.
Note that the fastest speed I've ever seen on my network is about 240 Mbps, even between "big, mature" computers. Maybe it's the $20 switches that I buy.
Poor Tx performance is board-specific, indicating a hardware issue on Tx path.
Running on SN001 gives:
------------------------------------------------------------
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 20.7 KByte (default)
------------------------------------------------------------
[ 5] local 10.0.239.241 port 54666 connected with 10.0.239.142 port 5001
[ 4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38386
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 230 MBytes 193 Mbits/sec
[ 5] 0.0-10.2 sec 55.3 MBytes 45.5 Mbits/sec
One possible culprit is the TX clock signal integrity (on the RGMII side), which looks pretty bad. The Tx signal jitter is also correspondingly poor. This could be because of bad PLL stability inside the iMX6, or could be because of ground bounce, power supply unsteadiness, etc.
Note that RGMII_REF_CLK shows similar instability. Odd. Measuring RGMII_REF_CLK causes the bitrate of the Tx side to drop dramatically. Seems like we have some correlation here.
Shorting out the termination on RGMII_REF_CLK vastly improves Tx performance:
------------------------------------------------------------
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 20.7 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.239.241 port 44027 connected with 10.0.239.142 port 5001
[ 5] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38477
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 184 MBytes 154 Mbits/sec
[ 5] 0.0-10.0 sec 157 MBytes 131 Mbits/sec
------------------------------------------------------------
However, the signal still shows some jitter. There are probably a couple of things going on inside the PHY that I'm not understanding, so it warrants further investigation with a high speed scope. As a stop-gap, it may be acceptable to short out the termination resistor R21G as it seems the drive strength of the KSZ9021RN isn't good enough to power through it. Currently, only SN001 has the fix.
SN003 with fix:
------------------------------------------------------------
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 45.5 KByte (default)
------------------------------------------------------------
[ 5] local 10.0.239.241 port 50859 connected with 10.0.239.142 port 5001
[ 4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38485
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 180 MBytes 151 Mbits/sec
[ 4] 0.0-10.0 sec 160 MBytes 134 Mbits/sec
SN005 with fix:
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 53.7 KByte (default)
------------------------------------------------------------
[ 5] local 10.0.239.241 port 51507 connected with 10.0.239.142 port 5001
[ 4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38488
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 179 MBytes 150 Mbits/sec
[ 4] 0.0-10.0 sec 159 MBytes 133 Mbits/sec
SN004 with fix:
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 53.7 KByte (default)
------------------------------------------------------------
[ 5] local 10.0.239.241 port 57324 connected with 10.0.239.142 port 5001
[ 4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38492
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 176 MBytes 148 Mbits/sec
[ 4] 0.0-10.0 sec 161 MBytes 135 Mbits/sec
As a stop-gap, it's ok, but the signal integrity of the refclk still looks like dogmeat.
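To compare boards before and after the fix without eyeballing the logs, the per-stream bandwidth can be pulled out of saved iperf output. A rough sketch, assuming the classic iperf summary-line format shown in this section and results reported in Mbits/sec; the function name `iperf_bw` is made up for illustration:

```shell
# Print "<stream id> <bandwidth in Mbits/sec>" for each iperf summary line on stdin.
# Only lines ending in "Mbits/sec" are matched; headers and connect lines are skipped.
iperf_bw() {
    awk '/Mbits\/sec/ {
        id = $0
        sub(/^\[ */, "", id)   # drop the leading "[ "
        sub(/\].*/, "", id)    # drop everything from "]" onward
        print id, $(NF-1)      # bandwidth is the field before "Mbits/sec"
    }'
}

# SN001 before the R21G fix:
iperf_bw <<'EOF'
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 230 MBytes 193 Mbits/sec
[ 5] 0.0-10.2 sec 55.3 MBytes 45.5 Mbits/sec
EOF
```

For the SN001 sample this prints `4 193` and `5 45.5`. Lines reported in other units (e.g. Kbits/sec) would be silently skipped, so this is only good for the runs logged here.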