Android教程網
  1. 首頁
  2. Android 技術
  3. Android 手機
  4. Android 系統教程
  5. Android 游戲
 Android教程網 >> Android技術 >> 關於Android編程 >> 實習雜記(31):android多dex方案二

實習雜記(31):android多dex方案二

編輯:關於Android編程

這一章是在繼續學習android多dex之前必須要准備的知識

作為一個android開發者,在開發應用時,隨著業務規模發展到一定程度,不斷地加入新功能、添加新的類庫,代碼在急劇的膨脹,相應的apk包的大小也急劇增加, 那麼終有一天,你會不幸遇到這個錯誤:

  1. 生成的apk在android 2.3或之前的機器上無法安裝,提示INSTALL_FAILED_DEXOPT
  2. 方法數量過多,編譯時出錯,提示:
 Conversion to Dalvik format failed:Unable to execute dex: method ID not in [0, 0xffff]: 65536

而問題產生的具體原因如下:

  1. 無法安裝(Android 2.3 INSTALL_FAILED_DEXOPT)問題,是由dexopt的LinearAlloc限制引起的,在Android版本不同分別經歷了4M/5M/8M/16M限制,目前主流4.2.x系統上可能都已到16M, 在Gingerbread或者以下系統LinearAllocHdr分配空間只有5M大小的, 高於Gingerbread的系統提升到了8M。Dalvik linearAlloc是一個固定大小的緩沖區。在應用的安裝過程中,系統會運行一個名為dexopt的程序為該應用在當前機型中運行做准備。dexopt使用LinearAlloc來存儲應用的方法信息。Android 2.2和2.3的緩沖區只有5MB,Android 4.x提高到了8MB或16MB。當方法數量過多導致超出緩沖區大小時,會造成dexopt崩潰。

  2. 超過最大方法數限制的問題,是由於DEX文件格式限制,一個DEX文件中method個數采用使用原生類型short來索引文件中的方法,也就是4個字節共計最多表達65536個method,field/class的個數也均有此限制。對於DEX文件,則是將工程所需全部class文件合並且壓縮到一個DEX文件期間,也就是Android打包的DEX過程中, 單個DEX文件可被引用的方法總數(自己開發的代碼以及所引用的Android框架、類庫的代碼)被限制為65536;
  `dexopt`過程是一個耗時操作,根據大牛的經驗,dex的大小直接影響opt時間。最直觀的想法是,等主界面顯示之後再去加載子Dex,但假如在加載的過程中,有用戶操作調用到了子Dex中的類,就出問題了。甚至一種很常見的情況,ActivityA屬於子Dex,然後放在後台被殺,此時用戶是可以直接通過最近列表返回該Activity的,這種情況下必崩。   鑒於此,opt這個過程需要先了解:這裡有一篇技術文檔:   Dalvik Optimization and Verification with Dexopt

The Dalvik virtual machine was designed specifically for the Android mobile platform. The target systems have little RAM, store data on slow internal flash memory, and generally have the performance characteristics of decade-old desktop systems. They also run Linux, which provides virtual memory, processes and threads, and UID-based security mechanisms.

Dalvik 虛擬機是專門為android 移動平台設計的,它的系統幾乎沒有內存,內部閃存存儲數據非常慢,也沒有桌面系統一般性能特征的那種表現。它是運行在LINUX平台上的,Linux為其提供虛擬內存,進程和線程,以及UID-based安全機制

The features and limitations caused us to focus on certain goals:

這種特點和限制讓我們有下面的改進:

Class data, notably bytecode, must be shared between multiple processes to minimize total system memory usage.類數據,尤其是字節碼,必須在多個進程之間共享,目的是減輕系統內存使用的總大小【加個前綴:經過opt處理】The overhead in launching a new app must be minimized to keep the device responsive.啟動一個新的app所用的開銷必須最小,目的是保證設備可以響應。【翻譯是這樣理解,加個前綴是:經過opt這個過程可以讓APP啟動更快。實際上就是主dex不應該干很多事的】Storing class data in individual files results in a lot of redundancy, especially with respect to strings. To conserve disk space we need to factor this out.把類數據存儲在一個文件裡會造成大量的冗余,尤其是對字符串而言。為了節省磁盤空間,需要把這個因素剔除。這個過程需要opt來處理了。Parsing class data fields adds unnecessary overhead during class loading. Accessing data values (e.g. integers and strings) directly as C types is better.在類加載過程中,解析類數據字段增加了不必要的開銷。訪問數據值如int,string類型的時候,直接用C類型值可能更好。這個過程opt也可以幫你處理。Bytecode verification is necessary, but slow, so we want to verify as much as possible outside app execution.字節碼驗證是必須的,但是這個過程非常的慢,在執行程序之前,我們想盡最大可能多的去驗證。【或者這樣翻譯:所以應該盡可能在程序運行之前就驗證完】Bytecode optimization (quickened instructions, method pruning) is important for speed and battery life.字節碼優化(加快指令,修剪方法)對於速度和電池壽命是非常重要的For security reasons, processes may not edit shared code.出於安全的考慮,這個過程可能不去編輯共享的代碼

The typical VM implementation uncompresses individual classes from a compressed archive and stores them on the heap. This implies a separate copy of each class in every process, and slows application startup because the code must be uncompressed (or at least read off disk in many small pieces). On the other hand, having the bytecode on the local heap makes it easy to rewrite instructions on first use, facilitating a number of different optimizations.

典型的虛擬機做法是實現從一個壓縮文件解壓單個文件的class,並且把它們存儲在Java堆上。這意味著每個類在每一個進程中都是獨立的,這必然會降低啟動的速度,因為在啟動之前代碼必須是已經解壓過的(或者以非常多的小碎片形式從磁盤上讀取),在另一方面,在本地堆上有字節碼,很容易在第一次使用的時候復寫指令,這樣帶來不同的優化過程。

The goals led us to make some fundamental decisions:

這樣的目標讓我們可以做如下的決策:指的是相對於典型的虛擬機而言,opt處理能夠干的事:

Multiple classes are aggregated into a single "DEX" file.多個Class文件聚合到一個dex文件中DEX files are mapped read-only and shared between processes.dex文件映射在多進程裡是只讀和共享的Byte ordering and word alignment are adjusted to suit the local system.字節次順和字節對齊調整,是為了適應某個特定設備的系統Bytecode verification is mandatory for all classes, but we want to "pre-verify" whatever we can.字節碼驗證是必須的,但是我們可以在我們想的任何時候進行 “pre-verify”操作Optimizations that require rewriting bytecode must be done ahead of time.需要重寫的字節碼優化必須在這個時間之前完成【我也沒有搞懂說的是啥】

The consequences of these decisions are explained in the following sections.

下面是對各種決策進行詳細的解釋:

 

VM Operation 虛擬機操作

Application code is delivered to the system in a.jaror.apkfile. These are really just.ziparchives with some meta-data files added. The Dalvik DEX data file is always calledclasses.dex.

應用程序都是被交付成 jar文件或者dex文件,這些文件實質都是zip格式文件的,

Dalyik dex數據文件也被稱作 classes.dex文件

The bytecode cannot be memory-mapped and executed directly from the zip file, because the data is compressed and the start of the file is not guaranteed to be word-aligned. These problems could be addressed by storingclasses.dexwithout compression and padding out the zip file, but that would increase the size of the package sent across the data network.

來自於zip文件的字節碼不能內存映射和直接執行,因為數據是壓縮的並且文件開頭並不保證字節對齊,

在沒有對zip文件進行壓縮和填充的情況下,這些問題對classes.dex的存儲不影響,但是會增加通過網絡傳送數據包的大小

We need to extractclasses.dexfrom the zip archive before we can use it. While we have the file available, we might as well perform some of the other actions (realignment, optimization, verification) described earlier. This raises a new question however: who is responsible for doing this, and where do we keep the output?

在我們使用classes.dex文件之前,我們需要從zip文件中提取它。當我們在讀取文件的時候,我們可能也會做一些調整,優化,驗證等行為,這就會導致一個新的問題,誰復雜做這件事?輸出位置在那裡?

 

Preparation 准備階段

There are at least three different ways to create a "prepared" DEX file, sometimes known as "ODEX" (for Optimized DEX):

至少有三種方法去創建 “prepared”dex文件,有時這種文件被稱為 ODEX ----優化的DEX

The VM does it "just in time". The output goes into a specialdalvik-cachedirectory. This works on the desktop and engineering-only device builds where the permissions on thedalvik-cachedirectory are not restricted. On production devices, this is not allowed.虛擬機會實時處理它。輸出到一個特別的虛擬機緩存文件目錄下。這種權限在桌面上並不是那麼嚴格,或者叫沒有。在生產環境設備裡這個也是沒有的The system installer does it when an application is first added. It has the privileges required to write todalvik-cache.當一個應用程序第一次被加入(系統)進來的時候,系統會安裝它。它有向dalvik緩存寫的特權(權限)The build system does it ahead of time. The relevantjar/apkfiles are present, but theclasses.dexis stripped out. The optimized DEX is stored next to the original zip archive, not indalvik-cache, and is part of the system image.提前構建,相關的jar或者apk文件是存在的,但是classes.dex文件是剝離的,優化的DEX文件不是存儲在虛擬機緩存裡面,是系統鏡像的一部分。

Thedalvik-cachedirectory is more accurately$ANDROID_DATA/data/dalvik-cache. The files inside it have names derived from the full path of the source DEX. On the device the directory is owned bysystem/systemand has 0771 permissions, and the optimized DEX files stored there are owned bysystemand the application's group, with 0644 permissions. DRM-locked applications will use 640 permissions to prevent other user applications from examining them. The bottom line is that you can read your own DEX file and those of most other applications, but you cannot create, modify, or remove them.

虛擬機緩存的目錄一般是:$ANDROID_DATA/data/dalvik-cache

裡面的文件命名都是來源於DEX源文件完整路徑,在一台設備上,這個目錄是被系統擁有,有0771權限,優化DEX文件存在在那裡,被系統擁有,是應用程序這個組裡面的,這個組擁有0644權限。DRM-LOCKED應用程序將使用0644權限,目的是為了防止其他的應用程序去檢查它們。你可以讀自己的DEX文件,或者是其他的大部分DEX文件,但是對其他的DEX文件,沒有權限去 創建、修改、刪除它們

Preparation of the DEX file for the "just in time" and "system installer" approaches proceeds in three steps:

DEX准備階段(包含:虛擬機實時處理和系統的安裝兩個)有三個步驟:

First, the dalvik-cache file is created. This must be done in a process with appropriate privileges, so for the "system installer" case this is done withininstalld, which runs as root.

首先是 虛擬機緩存目錄被創建,這個是需要在一定的權限下做的,他跟系統安裝這個階段是有交叉的

Second, theclasses.dexentry is extracted from the the zip archive. A small amount of space is left at the start of the file for the ODEX header.

從ZIP文件讀取classes.dex文件,然後在它的開頭需要留一些空間,應該是字節對齊吧

Third, the file is memory-mapped for easy access and tweaked for use on the current system. This includes byte-swapping and structure realigning, but no meaningful changes to the DEX file. We also do some basic structure checks, such as ensuring that file offsets and data indices fall within valid ranges.

第三,進行文件的內存映射,方便在當前的系統上進行使用訪問,這裡面的做法包含字節交換和結構調整,但是並沒有改變DEX文件的意義。這裡面還涉及到一些基本的結構檢查,比如確保文件偏移量、數據指標在有效的范圍內。

The build system uses a hairy process that involves starting the emulator, forcing just-in-time optimization of all relevant DEX files, and then extracting the results fromdalvik-cache. The reasons for doing this, rather than using a tool that runs on the desktop, will become more apparent when the optimizations are explained.

Once the code is byte-swapped and aligned, we're ready to go. We append some pre-computed data, fill in the ODEX header at the start of the file, and start executing. (The header is filled in last, so that we don't try to use a partial file.) If we're interested in verification and optimization, however, we need to insert a step after the initial prep.

一旦代碼進行了字節交換和對齊,這個階段就准備好了,會將一些預計算的數據填在DEX文件頭部,然後開始執行。

 

dexopt

We want to verify and optimize all of the classes in the DEX file. The easiest and safest way to do this is to load all of the classes into the VM and run through them. Anything that fails to load is simply not verified or optimized. Unfortunately, this can cause allocation of some resources that are difficult to release (e.g. loading of native shared libraries), so we don't want to do it in the same virtual machine that we're running applications in.

我們想要去驗證和優化DEX文件中所有的類。最簡單最安全的方法是把所有的類都加載到虛擬機運行,任何加載失敗的都是不能驗證和優化的,不幸的是,這樣做將會導致分配的一些資源難以釋放,比如說:加載本地共享庫,所以我們不想在同一個虛擬機上運行應用程序。

The solution is to invoke a program calleddexopt, which is really just a back door into the VM. It performs an abbreviated VM initialization, loads zero or more DEX files from the bootstrap class path, and then sets about verifying and optimizing whatever it can from the target DEX. On completion, the process exits, freeing all resources.

解決辦法就是去喚醒一個程序,他叫做dexopt,這叫是在虛擬機裡面加一個後門。opt程序執行一個簡短的虛擬機初始化過程,從引導類文件路徑加載0個或者多個dex文件,然後設置驗證和優化,不管它是不是目標dex文件。完成之後,進程退出,釋放所有資源。

It is possible for multiple VMs to want the same DEX file at the same time. File locking is used to ensure that dexopt is only run once.

有可能多個虛擬機在同一個時間想去加載同一個DEX文件。文件鎖就會被用到確保dexopt只運行一次。

 

Verification 驗證階段

The bytecode verification process involves scanning through the instructions in every method in every class in a DEX file. The goal is to identify illegal instruction sequences so that we don't have to check for them at run time. Many of the computations involved are also necessary for "exact" garbage collection. SeeDalvik Bytecode Verifier Notesfor more information.

通過指令字節碼驗證過程包括掃描DEX文件在每個類的每個方法,我們的目標是識別 非法的指令序列,這樣我們不需要在運行時檢測它們,對於“精確”垃圾回收,大量的計算也是必須的,在Dalvik Bytecode Vefifier Notes 這裡能看到更多的信息。

For performance reasons, the optimizer (described in the next section) assumes that the verifier has run successfully, and makes some potentially unsafe assumptions. By default, Dalvik insists upon verifying all classes, and only optimizes classes that have been verified. If you want to disable the verifier, you can use command-line flags to do so. See alsoControlling the Embedded VMfor instructions on controlling these features within the Android application framework.

由於性能原因,優化器假設驗證已經成功,做了一些潛在的不安全的假設。默認情況下,虛擬機會驗證所有的類,只優化那些驗證過的類。如果想要跳過驗證階段,可以使用命令行。

Reporting of verification failures is a tricky issue. For example, calling a package-scope method on a class in a different package is illegal and will be caught by the verifier. We don't necessarily want to report it during verification though -- we actually want to throw an exception when the method call is attempted. Checking the access flags on every method call is expensive though. TheDalvik Bytecode Verifier Notesdocument addresses this issue.

驗證失敗報告是一個棘手的問題。例如,調用package-scope方法,對於同一個類,在不同的包下是非法的,這個時候會被驗證階段抓住的,我們不一定要在驗證階段報告,實際上我們是當這個方法企圖去調用的時候,拋出一個異常。對每一個方法調用檢測訪問權限是非常高昂的, Dalvik Bytecode Verfier Notes這裡有強調這個問題。

Classes that have been verified successfully have a flag set in the ODEX. They will not be re-verified when loaded. The Linux access permissions are expected to prevent tampering; if you can get around those, installing faulty bytecode is far from the easiest line of attack. The ODEX file has a 32-bit checksum, but that's chiefly present as a quick check for corrupted data.

類已經驗證成功,ODEX文件會有一個標志的。當再次加載這個類的時候,它們是不會再次驗證的,這個就是這個標志的作用。

 

Linux訪問權限防止被修改,如果你想繞過那些權限,導致安裝了錯誤的字節碼對攻擊而言是最容易的。ODEX文件已經有32位校驗和,但是他主要是作為快速驗證損壞的數據。

 

Optimization 優化階段

Virtual machine interpreters typically perform certain optimizations the first time a piece of code is used. Constant pool references are replaced with pointers to internal data structures, operations that always succeed or always work a certain way are replaced with simpler forms. Some of these require information only available at runtime, others can be inferred statically when certain assumptions are made.

第一次使用某個片段代碼,虛擬機解釋其器通常執行某些優化。常量池引用被 與指向內部數據結構的指針取代,成功的或者某種工作方式會被一種簡單的形式替代。這些需求信息中的一些只有在運行是有訪問權限的,別人可以推斷靜態時的某些假設。

The Dalvik optimizer does the following:

虛擬機優化如下:

For virtual method calls, replace the method index with a vtable index.對虛擬方法的調用,用虛擬索引來替代方法索引For instance field get/put, replace the field index with a byte offset. Also, merge the boolean / byte / char / short variants into a single 32-bit form (less code in the interpreter means more room in the CPU I-cache).對於get/put字段,用一個字節的偏移量替代字段索引。同時,合並boolean/byte/char/short/ 導入到一個32位形式,【更少的代碼翻譯意味著在CPU I-cache有更多的空間】,Replace a handful of high-volume calls, like String.length(), with "inline" replacements. This skips the usual method call overhead, directly switching from the interpreter to a native implementation.Prune empty methods. The simplest example isObject., which does nothing, but must be called whenever any object is allocated. The instruction is replaced with a new version that acts as a no-op unless a debugger is attached.刪除空的方法,簡單的例子就是對象的初始化,什麼都不做的構造方法,但是每當對象分配的時候又必須需要,指令被新版本替代,新的版本中除了調試器無操作,Append pre-computed data. For example, the VM wants to have a hash table for lookups on class name. Instead of computing this when the DEX file is loaded, we can compute it now, saving heap space and computation time in every VM where the DEX is loaded.附加的預先計算數據,例如,虛擬機想要有一個哈希表,去查找類名,並不是在類被加載的時候去計算的,我們可以現在就計算它,節省堆空間和計算時間。

All of the instruction modifications involve replacing the opcode with one not defined by the Dalvik specification. This allows us to freely mix optimized and unoptimized instructions. The set of optimized instructions, and their exact representation, is tied closely to the VM version.

我們可以自由組合優化和非優化指令。

Most of the optimizations are obvious "wins". The use of raw indices and offsets not only allows us to execute more quickly, we can also skip the initial symbolic resolution. Pre-computation eats up disk space, and so must be done in moderation.

大多數優化是非常有成效的,索引和偏移量的使用不僅讓我們執行速度更快,,也讓我們跳過象征性的初始化決策,

預先預計了需要多少磁盤空間。

There are a couple of potential sources of trouble with these optimizations. First, vtable indices and byte offsets are subject to change if the VM is updated. Second, if a superclass is in a different DEX, and that other DEX is updated, we need to ensure that our optimized indices and offsets are updated as well. A similar but more subtle problem emerges when user-defined class loaders are employed: the class we actually call may not be the one we expected to call.

也有一些潛在的問題來自優化。第一,虛擬索引表和字節偏移量是針對變化的時候,當虛擬機更新了。第二,如果在不同的dex中都有同一個父類對象,如果某個dex文件父類發生了改變,我們需要去確保同時優化索引和偏移量。類似的,當開發者使用自己的類加載器的時候,一個更微妙的問題就發生了:我們實際調用的那個類可能不是我們期望的那個類。

These problems are addressed with dependency lists and some limitations on what can be optimized.

這些問題的解決是 靠依賴項和局限性

Dependencies and Limitations依賴項和局限性

The optimized DEX file includes a list of dependencies on other DEX files, plus the CRC-32 and modification date from the originatingclasses.dexzip file entry. The dependency list includes the full path to thedalvik-cachefile, and the file's SHA-1 signature. The timestamps of files on the device are unreliable and not used. The dependency area also includes the VM version number.

優化DEX文件包含一個依賴於其他的DEX文件的列表,從原始的calsses.dex文件加上CRC-32和日期信息。

這個依賴列表 包含 完整虛擬機緩存目錄路徑 ,還有文件的簽名,設備上的文件的時間戳是不可靠的,不能被使用的,

依賴區域還包括虛擬機的版本號。

An optimized DEX is dependent upon all of the DEX files in the bootstrap class path. DEX files that are part of the bootstrap class path depend upon the DEX files that appeared earlier. To ensure that nothing outside the dependent DEX files is available,dexoptonly loads the bootstrap classes. References to classes in other DEX files fail, which causes class loading and/or verification to fail, and classes with external dependencies are simply not optimized.

一個優化的DEX文件取決於所有的引導類路徑的DEX文件,引導類路徑的DEX文件一般在這個優化的DEX路徑之前先出現,要確保外部依賴dex文件都是可以用的,dexopt只加載引導類,

This means that splitting code out into many separate DEX files has a disadvantage: virtual method calls and instance field lookups between non-boot DEX files can't be optimized. Because verification is pass/fail with class granularity, no method in a class that has any reliance on classes in external DEX files can be optimized. This may be a bit heavy-handed, but it's the only way to guarantee that nothing breaks when individual pieces are updated.

Another negative consequence: any change to a bootstrap DEX will result in rejection of all optimized DEX files. This makes it hard to keep system updates small.

代碼分割成許多獨立的DEX文件有一個劣勢:虛方法調用和實例字段的查找,在不是引導類路徑的DEX文件裡面是不能進行優化的,

另外一個不良的後果是:在引導類路徑裡的DEX改變將會引起優化DEX的拒絕,這使得保持系統更小將變得有難度。

Despite our caution, there is still a possibility that a class in a DEX file loaded by a user-defined class loader could ask for a bootstrap class (say, String) and be given a different class with the same name. If a class in the DEX file being processed has the same name as a class in the bootstrap DEX files, the class will be flagged as ambiguous and references to it will not be resolved during verification / optimization. The class linking code in the VM does additional checks to plug another hole; see the verbose description in the VM sources for details (vm/oo/Class.c).

 

If one of the dependencies is updated, we need to re-verify and re-optimize the DEX file. If we can do a just-in-timedexoptinvocation, this is easy. If we have to rely on the installer daemon, or the DEX was shipped only in ODEX, then the VM has to reject the DEX.

The output ofdexoptis byte-swapped and struct-aligned for the host, and contains indices and offsets that are highly VM-specific (both version-wise and platform-wise). For this reason it's tricky to write a version ofdexoptthat runs on the desktop but generates output suitable for a particular device. The safest way to invoke it is on the target device, or on an emulator for that device.


 

Generated DEX

Some languages and frameworks rely on the ability to generate bytecode and execute it. The rather heavydexoptverification and optimization model doesn't work well with that.

We intend to support this in a future release, but the exact method is to be determined. We may allow individual classes to be added or whole DEX files; may allow Java bytecode or Dalvik bytecode in instructions; may perform the usual set of optimizations, or use a separate interpreter that performs on-first-use optimizations directly on the bytecode (which won't be mapped read-only, since it's locally defined).


  1. 上一頁:
  2. 下一頁:
熱門文章
閱讀排行版
Copyright © Android教程網 All Rights Reserved